carboxy terminus, or both. For example, any number of amino acids, ranging from 1-60,
can be deleted from the amino terminus of either the secreted polypeptide or the
mature form. Similarly, any number of amino acids, ranging from 1-30, can be deleted
from the carboxy terminus of the secreted protein or mature form. Furthermore, any
combination of the above amino and carboxy terminus deletions is preferred.
Similarly, polynucleotide fragments encoding these polypeptide fragments are also 
preferred. 

Particularly, N-terminal deletions of the polypeptide of the present invention can 
be described by the general formula m-p, where p is the total number of amino acids in 
the polypeptide and m is an integer from 2 to (p-1), and where both of these integers (m
& p) correspond to the position of the amino acid residue identified in SEQ ID NO: Y. 

Moreover, C-terminal deletions of the polypeptide of the present invention can 
also be described by the general formula 1-n, where n is an integer from 2 to (p-1), and
again where these integers (n & p) correspond to the position of the amino acid residue
identified in SEQ ID NO: Y.

The invention also provides polypeptides having one or more amino acids 
deleted from both the amino and the carboxy termini, which may be described
generally as having residues m-n of SEQ ID NO: Y, where m and n are integers as 
described above. 
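
The m-p, 1-n, and m-n notation can be illustrated with a short sketch. The Python below is illustrative only: the helper name `deletion_fragment` and the ten-residue example sequence are hypothetical and do not come from SEQ ID NO: Y. Positions are treated as 1-based residue numbers, with p taken as the length of the sequence.

```python
def deletion_fragment(seq, m=1, n=None):
    """Return residues m..n (1-based, inclusive) of a polypeptide sequence.

    m > 1 models an N-terminal deletion and n < p a C-terminal deletion,
    where p = len(seq). Hypothetical helper, for illustration only.
    """
    p = len(seq)
    if n is None:
        n = p
    if not (1 <= m <= n <= p):
        raise ValueError("require 1 <= m <= n <= p")
    return seq[m - 1:n]


example = "MKTLLVAGSC"  # made-up sequence, p = 10
n_terminal_deletion = deletion_fragment(example, m=3)   # residues 3-10 (m-p)
c_terminal_deletion = deletion_fragment(example, n=8)   # residues 1-8 (1-n)
double_deletion = deletion_fragment(example, m=3, n=8)  # residues 3-8 (m-n)
```

Under these assumptions, a fragment deleted at both termini is simply the slice shared by the corresponding N-terminal and C-terminal deletions.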

Also preferred are polypeptide and polynucleotide fragments characterized by
structural or functional domains, such as fragments that comprise alpha-helix and
alpha-helix-forming regions, beta-sheet and beta-sheet-forming regions, turn and
turn-forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding regions, and high antigenic index regions.
Polypeptide fragments of SEQ ID NO: Y falling within conserved domains are 
specifically contemplated by the present invention. Moreover, polynucleotide 
fragments encoding these domains are also contemplated. 

Other preferred fragments are biologically active fragments. Biologically active
fragments are those exhibiting activity similar, but not necessarily identical, to an
activity of the polypeptide of the present invention. The biological activity of the 
fragments may include an improved desired activity, or a decreased undesirable activity. 

Epitopes & Antibodies 

In the present invention, "epitopes" refer to polypeptide fragments having
antigenic or immunogenic activity in an animal, especially in a human. A preferred
embodiment of the present invention relates to a polypeptide fragment comprising an
epitope, as well as the polynucleotide encoding this fragment. A region of a protein
molecule to which an antibody can bind is defined as an "antigenic epitope." In 
contrast, an "immunogenic epitope" is defined as a part of a protein that elicits an 
antibody response. (See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA
81:3998-4002 (1983).)

Fragments which function as epitopes may be produced by any conventional 
means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 
(1985) further described in U.S. Patent No. 4,631,211.)

In the present invention, antigenic epitopes preferably contain a sequence of at
least seven, more preferably at least nine, and most preferably between about 15 and
about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including
monoclonal antibodies, that specifically bind the epitope. (See, for instance, Wilson et
al., Cell 37:767-778 (1984); Sutcliffe, J. G. et al., Science 219:660-666 (1983).)

Similarly, immunogenic epitopes can be used to induce antibodies according to
methods well known in the art. (See, for instance, Sutcliffe et al., supra; Wilson et al.,
supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et
al., J. Gen. Virol. 66:2347-2354 (1985).) A preferred immunogenic epitope includes
the secreted protein. The immunogenic epitopes may be presented together with a
carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if
the epitope is long enough (at least about 25 amino acids), without a carrier. However,
immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be
sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a
denatured polypeptide (e.g., in Western blotting).

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is
meant to include intact molecules as well as antibody fragments (such as, for example,
Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab
and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from
the circulation, and may have less non-specific tissue binding than an intact antibody.
(Wahl et al., J. Nucl. Med. 24:316-325 (1983).) Thus, these fragments are preferred,
as well as the products of a Fab or other immunoglobulin expression library.
Moreover, antibodies of the present invention include chimeric, single chain, and 
humanized antibodies. 

Fusion Proteins 

Any polypeptide of the present invention can be used to generate fusion
proteins. For example, the polypeptide of the present invention, when fused to a
second protein, can be used as an antigenic tag. Antibodies raised against the
polypeptide of the present invention can be used to indirectly detect the second protein
by binding to the polypeptide. Moreover, because secreted proteins target cellular
locations based on trafficking signals, the polypeptides of the present invention can be
used as targeting molecules once fused to other proteins.

Examples of domains that can be fused to polypeptides of the present invention
include not only heterologous signal sequences, but also other heterologous functional
regions. The fusion does not necessarily need to be direct, but may occur through
linker sequences.

Moreover, fusion proteins may also be engineered to improve characteristics of
the polypeptide of the present invention. For instance, a region of additional amino
acids, particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence during purification from the host cell or
subsequent handling and storage. Also, peptide moieties may be added to the
polypeptide to facilitate purification. Such regions may be removed prior to final
preparation of the polypeptide. The addition of peptide moieties to facilitate handling of
polypeptides is a familiar and routine technique in the art.

Moreover, polypeptides of the present invention, including fragments, and
specifically epitopes, can be combined with parts of the constant domain of
immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins
facilitate purification and show an increased half-life in vivo. One reported example
describes chimeric proteins consisting of the first two domains of the human
CD4-polypeptide and various domains of the constant regions of the heavy or light chains of
mammalian immunoglobulins. (EP A 394,827; Traunecker et al., Nature 331:84-86
(1988).) Fusion proteins having disulfide-linked dimeric structures (due to the IgG)
can also be more efficient in binding and neutralizing other molecules than the
monomeric secreted protein or protein fragment alone. (Fountoulakis et al., J.
Biochem. 270:3958-3964 (1995).)

Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion
proteins comprising various portions of constant region of immunoglobulin molecules
together with another human protein or part thereof. In many cases, the Fc part in a
fusion protein is beneficial in therapy and diagnosis, and thus can result in, for
example, improved pharmacokinetic properties. (EP-A 0232 262.) Alternatively,
deleting the Fc part after the fusion protein has been expressed, detected, and purified
may be desired. For example, the Fc portion may hinder therapy and diagnosis if the
fusion protein is used as an antigen for immunizations. In drug discovery, for
example, human proteins, such as hIL-5, have been fused with Fc portions for the
purpose of high-throughput screening assays to identify antagonists of hIL-5. (See, D.
Bennett et al., J. Molecular Recognition 8:52-58 (1995); K. Johanson et al., J. Biol.
Chem. 270:9459-9471 (1995).)

Moreover, the polypeptides of the present invention can be fused to marker
sequences, such as a peptide which facilitates purification of the fused polypeptide. In
preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide,
such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue,
Chatsworth, CA, 91311), among others, many of which are commercially available.
As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for
instance, hexa-histidine provides for convenient purification of the fusion protein.
Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope
derived from the influenza hemagglutinin protein. (Wilson et al., Cell 37:767 (1984).)

Thus, any of these above fusions can be engineered using the polynucleotides 
or the polypeptides of the present invention. 

Vectors, Host Cells, and Protein Production

The present invention also relates to vectors containing the polynucleotide of the 
present invention, host cells, and the production of polypeptides by recombinant 
techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral 
vector. Retroviral vectors may be replication competent or replication defective. In the
latter case, viral propagation generally will occur only in complementing host cells.

The polynucleotides may be joined to a vector containing a selectable marker for 
propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such 
as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is 
a virus, it may be packaged in vitro using an appropriate packaging cell line and then
transduced into host cells.

The polynucleotide insert should be operatively linked to an appropriate
promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac
promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to
name a few. Other suitable promoters will be known to the skilled artisan. The
expression constructs will further contain sites for transcription initiation, termination,
and, in the transcribed region, a ribosome binding site for translation. The coding
portion of the transcripts expressed by the constructs will preferably include a
translation initiating codon at the beginning and a termination codon (UAA, UGA or
UAG) appropriately positioned at the end of the polypeptide to be translated.
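
As a rough illustration of the initiation and termination codon requirement, the following sketch checks the coding portion of a transcript. It is not part of the disclosed method: the function name `has_start_and_stop` is hypothetical, AUG is assumed as the initiating codon (the text does not name one), and reading "appropriately positioned" as "no in-frame termination codon before the end" is likewise an assumption.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}  # termination codons named in the text


def has_start_and_stop(coding_rna, start_codon="AUG"):
    """Check that a coding region begins with a start codon, ends with a
    termination codon, and has no in-frame termination codon in between."""
    rna = coding_rna.upper().replace("T", "U")  # accept DNA or RNA letters
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    if codons[0] != start_codon or codons[-1] not in STOP_CODONS:
        return False
    # Assumption: an earlier in-frame stop would truncate the polypeptide.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])
```

A construct whose coding portion fails this check would be expected either not to initiate translation or to terminate at an unintended position.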

As indicated, the expression vectors will preferably include at least one
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance
genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect
cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS,
293, and Bowes melanoma cells; and plant cells. Appropriate culture media and
conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9,
available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A,
pNH16a, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech,
Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1
and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available
from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection, or other methods. Such methods
are described in many standard laboratory manuals, such as Davis et al., Basic Methods
In Molecular Biology (1986). It is specifically contemplated that the polypeptides of the
present invention may in fact be expressed by a host cell lacking a recombinant vector.

A polypeptide of this invention can be recovered and purified from recombinant
cell cultures by well-known methods including ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation exchange chromatography,
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity
chromatography, hydroxylapatite chromatography and lectin chromatography. Most
preferably, high performance liquid chromatography ("HPLC") is employed for
purification.

Polypeptides of the present invention, and preferably the secreted form, can also
be recovered from: products purified from natural sources, including bodily fluids,
tissues and cells, whether directly isolated or cultured; products of chemical synthetic
procedures; and products produced by recombinant techniques from a prokaryotic or
eukaryotic host, including, for example, bacterial, yeast, higher plant, insect, and
mammalian cells. Depending upon the host employed in a recombinant production
procedure, the polypeptides of the present invention may be glycosylated or may be
non-glycosylated. In addition, polypeptides of the invention may also include an initial
modified methionine residue, in some cases as a result of host-mediated processes.
Thus, it is well known in the art that the N-terminal methionine encoded by the
translation initiation codon generally is removed with high efficiency from any protein
after translation in all eukaryotic cells. While the N-terminal methionine on most
proteins also is efficiently removed in most prokaryotes, for some proteins, this 
prokaryotic removal process is inefficient, depending on the nature of the amino acid to 
which the N-terminal methionine is covalently linked. 

Uses of the Polynucleotides 

Each of the polynucleotides identified herein can be used in numerous ways as 
reagents. The following description should be considered exemplary and utilizes 
known techniques. 

The polynucleotides of the present invention are useful for chromosome
identification. There exists an ongoing need to identify new chromosome markers,
since few chromosome marking reagents, based on actual sequence data (repeat 
polymorphisms), are presently available. Each polynucleotide of the present invention 
can be used as a chromosome marker. 

Briefly, sequences can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp) from the sequences shown in SEQ ID NO:X. Primers can be
selected using computer analysis so that primers do not span more than one predicted
exon in the genomic DNA. These primers are then used for PCR screening of somatic
cell hybrids containing individual human chromosomes. Only those hybrids containing
the human gene corresponding to the SEQ ID NO:X will yield an amplified fragment.
Similarly, somatic hybrids provide a rapid method of PCR mapping the
polynucleotides to particular chromosomes. Three or more clones can be assigned per
day using a single thermal cycler. Moreover, sublocalization of the polynucleotides can
be achieved with panels of specific chromosome fragments. Other gene mapping
strategies that can be used include in situ hybridization, prescreening with labeled
flow-sorted chromosomes, and preselection by hybridization to construct
chromosome-specific cDNA libraries.
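
The constraint that a primer should not span more than one predicted exon can be sketched as follows. This is illustrative only: the exon coordinates, the function name `primer_windows`, and the 0-based, end-exclusive coordinate convention are all assumptions, and practical primer selection would also weigh factors such as melting temperature and specificity.

```python
def primer_windows(exons, primer_len=20):
    """Yield (start, end) windows of primer_len bases that lie entirely
    inside a single exon. Exons are (start, end) pairs, 0-based and
    end-exclusive, on the genomic sequence."""
    if not 15 <= primer_len <= 25:
        raise ValueError("the text prefers primers of 15-25 bp")
    for exon_start, exon_end in exons:
        # An exon shorter than primer_len produces an empty range here.
        for start in range(exon_start, exon_end - primer_len + 1):
            yield (start, start + primer_len)


# Hypothetical predicted exons: one of 30 bp and one of 10 bp.
predicted_exons = [(100, 130), (400, 410)]
windows = list(primer_windows(predicted_exons, primer_len=20))
```

In this made-up example only the 30 bp exon can hold a 20 bp primer, so every candidate window falls inside it.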

Precise chromosomal location of the polynucleotides can also be achieved using
fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. This
technique uses polynucleotides as short as 500 or 600 bases; however, polynucleotides
2,000-4,000 bp are preferred. For a review of this technique, see Verma et al.,
"Human Chromosomes: a Manual of Basic Techniques," Pergamon Press, New York
(1988).

For chromosome mapping, the polynucleotides can be used individually (to
mark a single chromosome or a single site on that chromosome) or in panels (for
marking multiple sites and/or multiple chromosomes). Preferred polynucleotides
correspond to the noncoding regions of the cDNAs because the coding sequences are
more likely conserved within gene families, thus increasing the chance of cross
hybridization during chromosomal mapping.

Once a polynucleotide has been mapped to a precise chromosomal location, the
physical position of the polynucleotide can be used in linkage analysis. Linkage
analysis establishes coinheritance between a chromosomal location and presentation of a
particular disease. (Disease mapping data are found, for example, in V. McKusick,
Mendelian Inheritance in Man (available on line through Johns Hopkins University
Welch Medical Library).) Assuming 1 megabase mapping resolution and one gene per
20 kb, a cDNA precisely localized to a chromosomal region associated with the disease
could be one of 50-500 potential causative genes.
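
The arithmetic behind this estimate can be made explicit. In the sketch below the function name `candidate_genes` is hypothetical, and the 10 megabase figure is an assumption introduced only to show how a count near 500 could arise; the text itself states only 1 megabase resolution and one gene per 20 kb.

```python
def candidate_genes(region_bp, bp_per_gene=20_000):
    """Estimate how many genes fall in a mapped chromosomal region,
    assuming a uniform density of one gene per bp_per_gene bases."""
    return region_bp // bp_per_gene


at_1_mb = candidate_genes(1_000_000)    # resolution stated in the text
at_10_mb = candidate_genes(10_000_000)  # assumed coarser resolution
```

Under the stated density, a 1 megabase region corresponds to the lower end of the quoted range.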

Thus, once coinheritance is established, differences in the polynucleotide and
the corresponding gene between affected and unaffected individuals can be examined.
First, visible structural alterations in the chromosomes, such as deletions or
translocations, are examined in chromosome spreads or by PCR. If no structural
alterations exist, the presence of point mutations is ascertained. Mutations observed in
some or all affected individuals, but not in normal individuals, indicate that the
mutation may cause the disease. However, complete sequencing of the polypeptide and
the corresponding gene from several normal individuals is required to distinguish the
mutation from a polymorphism. If a new polymorphism is identified, this polymorphic
polypeptide can be used for further linkage analysis.

Furthermore, increased or decreased expression of the gene in affected 
individuals as compared to unaffected individuals can be assessed using 
polynucleotides of the present invention. Any of these alterations (altered expression, 
chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic
marker.

In addition to the foregoing, a polynucleotide can be used to control gene
expression through triple helix formation or antisense DNA or RNA. Both methods
rely on binding of the polynucleotide to DNA or RNA. For these techniques, preferred
polynucleotides are usually 20 to 40 bases in length and complementary to either the
region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids
Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science
251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxy-nucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca Raton, FL (1988).) Triple helix formation optimally results in a shut-off
of RNA transcription from DNA, while antisense RNA hybridization blocks translation
of an mRNA molecule into polypeptide. Both techniques are effective in model
systems, and the information disclosed herein can be used to design antisense or triple
helix polynucleotides in an effort to treat disease.
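
An antisense polynucleotide of the preferred 20 to 40 base length is the reverse complement of its mRNA target. The sketch below assumes a DNA antisense oligonucleotide and uses a made-up mRNA fragment; the function name `antisense_dna` is hypothetical, and no claim is made about which target regions are effective.

```python
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}  # mRNA base -> DNA base


def antisense_dna(mrna, start, length=20):
    """Return a DNA oligonucleotide complementary to
    mrna[start:start + length], written 5' to 3'."""
    if not 20 <= length <= 40:
        raise ValueError("the text prefers 20 to 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Reverse the target, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


mrna_fragment = "AUGGCUAGCUUAGGCAUCGAUCGGAUCC"  # hypothetical sequence
oligo = antisense_dna(mrna_fragment, start=0, length=20)
```

Reading the result from its 3' end pairs it base by base with the first 20 bases of the hypothetical target.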

Polynucleotides of the present invention are also useful in gene therapy. One 
goal of gene therapy is to insert a normal gene into an organism having a defective
gene, in an effort to correct the genetic defect. The polynucleotides disclosed in the
present invention offer a means of targeting such genetic defects in a highly accurate 
manner. Another goal is to insert a new gene that was not present in the host genome, 
thereby producing a new trait in the host cell. 

The polynucleotides are also useful for identifying individuals from minute
biological samples. The United States military, for example, is considering the use of
restriction fragment length polymorphism (RFLP) for identification of its personnel. In
this technique, an individual's genomic DNA is digested with one or more restriction
enzymes, and probed on a Southern blot to yield unique bands for identifying
personnel. This method does not suffer from the current limitations of "Dog Tags"
which can be lost, switched, or stolen, making positive identification difficult. The
polynucleotides of the present invention can be used as additional DNA markers for
RFLP.

The polynucleotides of the present invention can also be used as an alternative to
RFLP, by determining the actual base-by-base DNA sequence of selected portions of an
individual's genome. These sequences can be used to prepare PCR primers for
amplifying and isolating such selected DNA, which can then be sequenced. Using this
technique, individuals can be identified because each individual will have a unique set
of DNA sequences. Once a unique ID database is established for an individual,
positive identification of that individual, living or dead, can be made from extremely
small tissue samples.

Forensic biology also benefits from using DNA-based identification techniques
as disclosed herein. DNA sequences taken from very small biological samples such as
tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be
amplified using PCR. In one prior art technique, gene sequences amplified from
polymorphic loci, such as DQa class II HLA gene, are used in forensic biology to
identify individuals. (Erlich, H., PCR Technology, Freeman and Co. (1992).) Once
these specific polymorphic loci are amplified, they are digested with one or more
restriction enzymes, yielding an identifying set of bands on a Southern blot probed with
DNA corresponding to the DQa class II HLA gene. Similarly, polynucleotides of the
present invention can be used as polymorphic markers for forensic purposes.

There is also a need for reagents capable of identifying the source of a particular
tissue. Such need arises, for example, in forensics when presented with tissue of
unknown origin. Appropriate reagents can comprise, for example, DNA probes or
primers specific to particular tissue prepared from the sequences of the present
invention. Panels of such reagents can identify tissue by species and/or by organ type.
In a similar fashion, these reagents can be used to screen tissue cultures for
contamination.

At the very least, the polynucleotides of the present invention can be used as
molecular weight markers on Southern gels, as diagnostic probes for the presence of a
specific mRNA in a particular cell type, as a probe to "subtract-out" known sequences
in the process of discovering novel polynucleotides, for selecting and making oligomers
for attachment to a "gene chip" or other support, to raise anti-DNA antibodies using
DNA immunization techniques, and as an antigen to elicit an immune response.

Uses of the Polypeptides 

Each of the polypeptides identified herein can be used in numerous ways. The
following description should be considered exemplary and utilizes known techniques.

A polypeptide of the present invention can be used to assay protein levels in a
biological sample using antibody-based techniques. For example, protein expression in
tissues can be studied with classical immunohistological methods. (Jalkanen, M., et
al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol. 105:3087-3096
(1987).) Other antibody-based methods useful for detecting protein gene
expression include immunoassays, such as the enzyme linked immunosorbent assay
(ELISA) and the radioimmunoassay (RIA). Suitable antibody assay labels are known
in the art and include enzyme labels, such as glucose oxidase, and radioisotopes, such
as iodine (125I, 121I), carbon (14C), sulfur (35S), tritium (3H), indium (112In), and
technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and
biotin.

In addition to assaying secreted protein levels in a biological sample, proteins
can also be detected in vivo by imaging. Antibody labels or markers for in vivo
imaging of protein include those detectable by X-radiography, NMR or ESR. For
X-radiography, suitable labels include radioisotopes such as barium or cesium, which emit
detectable radiation but are not overtly harmful to the subject. Suitable markers for
NMR and ESR include those with a detectable characteristic spin, such as deuterium,
which may be incorporated into the antibody by labeling of nutrients for the relevant
hybridoma.

A protein-specific antibody or antibody fragment which has been labeled with
an appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear magnetic
resonance, is introduced (for example, parenterally, subcutaneously, or
intraperitoneally) into the mammal. It will be understood in the art that the size of the
subject and the imaging system used will determine the quantity of imaging moiety
needed to produce diagnostic images. In the case of a radioisotope moiety, for a human
subject, the quantity of radioactivity injected will normally range from about 5 to 20
millicuries of 99mTc. The labeled antibody or antibody fragment will then
preferentially accumulate at the location of cells which contain the specific protein. In
vivo tumor imaging is described in S.W. Burchiel et al., "Immunopharmacokinetics of
Radiolabeled Antibodies and Their Fragments." (Chapter 13 in Tumor Imaging: The
Radiochemical Detection of Cancer, S.W. Burchiel and B. A. Rhodes, eds., Masson
Publishing Inc. (1982).)

Thus, the invention provides a method of diagnosing a disorder, which involves
(a) assaying the expression of a polypeptide of the present invention in cells or body
fluid of an individual; and (b) comparing the level of gene expression with a standard gene
expression level, whereby an increase or decrease in the assayed polypeptide gene
expression level compared to the standard expression level is indicative of a disorder.

Moreover, polypeptides of the present invention can be used to treat disease.
For example, patients can be administered a polypeptide of the present invention in an
effort to replace absent or decreased levels of the polypeptide (e.g., insulin), to
supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S
for hemoglobin B), to inhibit the activity of a polypeptide (e.g., an oncogene), to
activate the activity of a polypeptide (e.g., by binding to a receptor), to reduce the
activity of a membrane bound receptor by competing with it for free ligand (e.g.,
soluble TNF receptors used in reducing inflammation), or to bring about a desired
response (e.g., blood vessel growth).

Similarly, antibodies directed to a polypeptide of the present invention can also
be used to treat disease. For example, administration of an antibody directed to a
polypeptide of the present invention can bind and reduce overproduction of the
polypeptide. Similarly, administration of an antibody can activate the polypeptide, such
as by binding to a polypeptide bound to a membrane (receptor).

At the very least, the polypeptides of the present invention can be used as
molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration
columns using methods well known to those of skill in the art. Polypeptides can also
be used to raise antibodies, which in turn are used to measure protein expression from a
recombinant cell, as a way of assessing transformation of the host cell. Moreover, the
polypeptides of the present invention can be used to test the following biological
activities.

Biological Activities

The polynucleotides and polypeptides of the present invention can be used in 
assays to test for one or more biological activities. If these polynucleotides and 
polypeptides do exhibit activity in a particular assay, it is likely that these molecules 
may be involved in the diseases associated with the biological activity. Thus, the 
polynucleotides and polypeptides could be used to treat the associated disease. 

Immune Activity 

A polypeptide or polynucleotide of the present invention may be useful in 
treating deficiencies or disorders of the immune system, by activating or inhibiting the 
proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune 
cells develop through a process called hematopoiesis, producing myeloid (platelets, red 
blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes) cells 
from pluripotent stem cells. The etiology of these immune deficiencies or disorders 
may be genetic, somatic, such as cancer or some autoimmune disorders, acquired (e.g., 
by chemotherapy or toxins), or infectious. Moreover, a polynucleotide or polypeptide 
of the present invention can be used as a marker or detector of a particular immune 
system disease or disorder. 

A polynucleotide or polypeptide of the present invention may be useful in
treating or detecting deficiencies or disorders of hematopoietic cells. A polypeptide or
polynucleotide of the present invention could be used to increase differentiation and
proliferation of hematopoietic cells, including the pluripotent stem cells, in an effort to
treat those disorders associated with a decrease in certain (or many) types of hematopoietic
cells. Examples of immunologic deficiency syndromes include, but are not limited to:
blood protein disorders (e.g., agammaglobulinemia, dysgammaglobulinemia), ataxia
telangiectasia, common variable immunodeficiency, DiGeorge Syndrome, HIV
infection, HTLV-BLV infection, leukocyte adhesion deficiency syndrome,
lymphopenia, phagocyte bactericidal dysfunction, severe combined immunodeficiency
(SCIDs), Wiskott-Aldrich Disorder, anemia, thrombocytopenia, or hemoglobinuria.

Moreover, a polypeptide or polynucleotide of the present invention could also 
be used to modulate hemostatic (the stopping of bleeding) or thrombolytic activity (clot 
formation). For example, by increasing hemostatic or thrombolytic activity, a 
polynucleotide or polypeptide of the present invention could be used to treat blood 
coagulation disorders (e.g., afibrinogenemia, factor deficiencies), blood platelet 
disorders (e.g. thrombocytopenia), or wounds resulting from trauma, surgery, or other 
causes. Alternatively, a polynucleotide or polypeptide of the present invention that can
decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve
clotting. These molecules could be important in the treatment of heart attacks 
(infarction), strokes, or scarring. 

A polynucleotide or polypeptide of the present invention may also be useful in
treating or detecting autoimmune disorders. Many autoimmune disorders result from
inappropriate recognition of self as foreign material by immune cells. This
inappropriate recognition results in an immune response leading to the destruction of the
host tissue. Therefore, the administration of a polypeptide or polynucleotide of the
present invention that inhibits an immune response, particularly the proliferation,
differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing
autoimmune disorders.

Examples of autoimmune disorders that can be treated or detected by the present
invention include, but are not limited to: Addison's Disease, hemolytic anemia,
antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic encephalomyelitis,
glomerulonephritis, Goodpasture's Syndrome, Graves' Disease, Multiple Sclerosis,
Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid, Pemphigus,
Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome, Autoimmune
Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary Inflammation,
Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and autoimmune
inflammatory eye disease.

Similarly, allergic reactions and conditions, such as asthma (particularly allergic 
asthma) or other respiratory problems, may also be treated by a polypeptide or 
polynucleotide of the present invention. Moreover, these molecules can be used to treat 
anaphylaxis, hypersensitivity to an antigenic molecule, or blood group incompatibility. 

A polynucleotide or polypeptide of the present invention may also be used to
treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ
rejection occurs by host immune cell destruction of the transplanted tissue through an
immune response. Similarly, an immune response is also involved in GVHD, but, in
this case, the foreign transplanted immune cells destroy the host tissues. The
administration of a polypeptide or polynucleotide of the present invention that inhibits
an immune response, particularly the proliferation, differentiation, or chemotaxis of
T-cells, may be an effective therapy in preventing organ rejection or GVHD.

Similarly, a polypeptide or polynucleotide of the present invention may also be
used to modulate inflammation. For example, the polypeptide or polynucleotide may
inhibit the proliferation and differentiation of cells involved in an inflammatory
response. These molecules can be used to treat inflammatory conditions, both chronic
and acute conditions, including inflammation associated with infection (e.g., septic
shock, sepsis, or systemic inflammatory response syndrome (SIRS)),
ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated
hyperacute rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory
bowel disease, Crohn's disease, or resulting from over production of cytokines (e.g.,
TNF or IL-1).

Hyperproliferative Disorders

A polypeptide or polynucleotide can be used to treat or detect hyperproliferative 
disorders, including neoplasms. A polypeptide or polynucleotide of the present 
invention may inhibit the proliferation of the disorder through direct or indirect 
interactions. Alternatively, a polypeptide or polynucleotide of the present invention 
may proliferate other cells which can inhibit the hyperproliferative disorder. 

For example, by increasing an immune response, particularly increasing 
antigenic qualities of the hyperproliferative disorder or by proliferating, differentiating,
or mobilizing T-cells, hyperproliferative disorders can be treated. This immune 
response may be increased by either enhancing an existing immune response, or by 
initiating a new immune response. Alternatively, decreasing an immune response may 
also be a method of treating hyperproliferative disorders, such as a chemotherapeutic 
agent. 

Examples of hyperproliferative disorders that can be treated or detected by a 
polynucleotide or polypeptide of the present invention include, but are not limited to 
neoplasms located in the: abdomen, bone, breast, digestive system, liver, pancreas, 
peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus, 
thyroid), eye, head and neck, nervous (central and peripheral), lymphatic system, 
pelvic, skin, soft tissue, spleen, thoracic, and urogenital. 

Similarly, other hyperproliferative disorders can also be treated or detected by a 
polynucleotide or polypeptide of the present invention. Examples of such 
hyperproliferative disorders include, but are not limited to: hypergammaglobulinemia, 
lymphoproliferative disorders, paraproteinemias, purpura, sarcoidosis, Sezary 
Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's Disease, histiocytosis, and
any other hyperproliferative disease, besides neoplasia, located in an organ system 
listed above. 

Infectious Disease 

A polypeptide or polynucleotide of the present invention can be used to treat or
detect infectious agents. For example, by increasing the immune response, particularly
increasing the proliferation and differentiation of B and/or T cells, infectious diseases
may be treated. The immune response may be increased by either enhancing an existing
immune response, or by initiating a new immune response. Alternatively, the
polypeptide or polynucleotide of the present invention may also directly inhibit the
infectious agent, without necessarily eliciting an immune response.

Viruses are one example of an infectious agent that can cause disease or
symptoms that can be treated or detected by a polynucleotide or polypeptide of the
present invention. Examples of viruses include, but are not limited to, the following
DNA and RNA viral families: Arbovirus, Adenoviridae, Arenaviridae, Arterivirus,
Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Flaviviridae,
Hepadnaviridae (Hepatitis), Herpesviridae (such as Cytomegalovirus, Herpes
Simplex, Herpes Zoster), Mononegavirus (e.g., Paramyxoviridae, Morbillivirus,
Rhabdoviridae), Orthomyxoviridae (e.g., Influenza), Papovaviridae, Parvoviridae,
Picornaviridae, Poxviridae (such as Smallpox or Vaccinia), Reoviridae (e.g.,
Rotavirus), Retroviridae (HTLV-I, HTLV-II, Lentivirus), and Togaviridae (e.g.,
Rubivirus). Viruses falling within these families can cause a variety of diseases or
symptoms, including, but not limited to: arthritis, bronchiolitis, encephalitis, eye
infections (e.g., conjunctivitis, keratitis), chronic fatigue syndrome, hepatitis (A, B, C,
E, Chronic Active, Delta), meningitis, opportunistic infections (e.g., AIDS),
pneumonia, Burkitt's Lymphoma, chickenpox, hemorrhagic fever, Measles, Mumps,
Parainfluenza, Rabies, the common cold, Polio, leukemia, Rubella, sexually
transmitted diseases, skin diseases (e.g., Kaposi's, warts), and viremia. A polypeptide
or polynucleotide of the present invention can be used to treat or detect any of these
symptoms or diseases.

Similarly, bacterial or fungal agents that can cause disease or symptoms and that
can be treated or detected by a polynucleotide or polypeptide of the present invention
include, but are not limited to, the following Gram-negative and Gram-positive bacterial
families and fungi: Actinomycetales (e.g., Corynebacterium, Mycobacterium,
Nocardia), Aspergillosis, Bacillaceae (e.g., Anthrax, Clostridium), Bacteroidaceae,
Blastomycosis, Bordetella, Borrelia, Brucellosis, Candidiasis, Campylobacter,
Coccidioidomycosis, Cryptococcosis, Dermatomycoses, Enterobacteriaceae (Klebsiella,
Salmonella, Serratia, Yersinia), Erysipelothrix, Helicobacter, Legionellosis,
Leptospirosis, Listeria, Mycoplasmatales, Neisseriaceae (e.g., Acinetobacter,
Gonorrhea, Meningococcal), Pasteurellaceae Infections (e.g., Actinobacillus,
Haemophilus, Pasteurella), Pseudomonas, Rickettsiaceae, Chlamydiaceae, Syphilis,
and Staphylococcal. These bacterial or fungal families can cause the following diseases
or symptoms, including, but not limited to: bacteremia, endocarditis, eye infections
(conjunctivitis, tuberculosis, uveitis), gingivitis, opportunistic infections (e.g., AIDS
related infections), paronychia, prosthesis-related infections, Reiter's Disease,
respiratory tract infections, such as Whooping Cough or Empyema, sepsis, Lyme
Disease, Cat-Scratch Disease, Dysentery, Paratyphoid Fever, food poisoning,
Typhoid, pneumonia, Gonorrhea, meningitis, Chlamydia, Syphilis, Diphtheria,
Leprosy, Paratuberculosis, Tuberculosis, Lupus, Botulism, gangrene, tetanus,
impetigo, Rheumatic Fever, Scarlet Fever, sexually transmitted diseases, skin diseases
(e.g., cellulitis, dermatomycoses), toxemia, urinary tract infections, and wound
infections. A polypeptide or polynucleotide of the present invention can be used to
treat or detect any of these symptoms or diseases.

Moreover, parasitic agents causing disease or symptoms that can be treated or
detected by a polynucleotide or polypeptide of the present invention include, but are not
limited to, the following families: Amebiasis, Babesiosis, Coccidiosis,
Cryptosporidiosis, Dientamoebiasis, Dourine, Ectoparasitic, Giardiasis, Helminthiasis,
Leishmaniasis, Theileriasis, Toxoplasmosis, Trypanosomiasis, and Trichomonas.
These parasites can cause a variety of diseases or symptoms, including, but not limited
to: Scabies, Trombiculiasis, eye infections, intestinal disease (e.g., dysentery,
giardiasis), liver disease, lung disease, opportunistic infections (e.g., AIDS related),
Malaria, pregnancy complications, and toxoplasmosis. A polypeptide or polynucleotide
of the present invention can be used to treat or detect any of these symptoms or
diseases.

Preferably, treatment using a polypeptide or polynucleotide of the present 
invention could either be by administering an effective amount of a polypeptide to the 
patient, or by removing cells from the patient, supplying the cells with a polynucleotide 
of the present invention, and returning the engineered cells to the patient (ex vivo 
therapy). Moreover, the polypeptide or polynucleotide of the present invention can be
used as an antigen in a vaccine to raise an immune response against infectious disease. 

Regeneration 

A polynucleotide or polypeptide of the present invention can be used to
differentiate, proliferate, and attract cells, leading to the regeneration of tissues. (See,
Science 276:59-87 (1997).) The regeneration of tissues could be used to repair,
replace, or protect tissue damaged by congenital defects, trauma (wounds, burns,
incisions, or ulcers), age, disease (e.g., osteoporosis, osteoarthritis, periodontal
disease, liver failure), surgery, including cosmetic plastic surgery, fibrosis, reperfusion
injury, or systemic cytokine damage.

Tissues that could be regenerated using the present invention include organs
(e.g., pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal
or cardiac), vascular (including vascular endothelium), nervous, hematopoietic, and
skeletal (bone, cartilage, tendon, and ligament) tissue. Preferably, regeneration occurs
without scarring or with decreased scarring. Regeneration also may include angiogenesis.

Moreover, a polynucleotide or polypeptide of the present invention may increase
regeneration of tissues difficult to heal. For example, increased tendon/ligament
regeneration would quicken recovery time after damage. A polynucleotide or
polypeptide of the present invention could also be used prophylactically in an effort to
avoid damage. Specific diseases that could be treated include tendinitis, carpal tunnel
syndrome, and other tendon or ligament defects. A further example of tissue
regeneration of non-healing wounds includes pressure ulcers, ulcers associated with
vascular insufficiency, surgical, and traumatic wounds.

Similarly, nerve and brain tissue could also be regenerated by using a
polynucleotide or polypeptide of the present invention to proliferate and differentiate
nerve cells. Diseases that could be treated using this method include central and
peripheral nervous system diseases, neuropathies, or mechanical and traumatic
disorders (e.g., spinal cord disorders, head trauma, cerebrovascular disease, and
stroke). Specifically, diseases associated with peripheral nerve injuries, peripheral
neuropathy (e.g., resulting from chemotherapy or other medical therapies), localized
neuropathies, and central nervous system diseases (e.g., Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome), could all be treated using the polynucleotide or polypeptide of
the present invention.

Chemotaxis 

A polynucleotide or polypeptide of the present invention may have chemotaxis
activity. A chemotactic molecule attracts or mobilizes cells (e.g., monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells) to a particular site in the body, such as a site of inflammation, infection, or
hyperproliferation. The mobilized cells can then fight off and/or heal the particular
trauma or abnormality.

A polynucleotide or polypeptide of the present invention may increase
chemotactic activity of particular cells. These chemotactic molecules can then be used to
treat inflammation, infection, hyperproliferative disorders, or any immune system
disorder by increasing the number of cells targeted to a particular location in the body.
For example, chemotactic molecules can be used to treat wounds and other trauma to
tissues by attracting immune cells to the injured location. Chemotactic molecules of the
present invention can also attract fibroblasts, which can be used to treat wounds.

It is also contemplated that a polynucleotide or polypeptide of the present
invention may inhibit chemotactic activity. These molecules could also be used to treat
disorders. Thus, a polynucleotide or polypeptide of the present invention could be used
as an inhibitor of chemotaxis.

Binding Activity 

A polypeptide of the present invention may be used to screen for molecules that 
bind to the polypeptide or for molecules to which the polypeptide binds. The binding 
of the polypeptide and the molecule may activate (agonist), increase, inhibit 
(antagonist), or decrease activity of the polypeptide or the molecule bound. Examples
of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or
small molecules. 

Preferably, the molecule is closely related to the natural ligand of the
polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, a structural
or functional mimetic. (See, Coligan et al., Current Protocols in Immunology
1(2):Chapter 5 (1991).) Similarly, the molecule can be closely related to the natural
receptor to which the polypeptide binds, or at least, a fragment of the receptor capable
of being bound by the polypeptide (e.g., active site). In either case, the molecule can
be rationally designed using known techniques.

Preferably, the screening for these molecules involves producing appropriate
cells which express the polypeptide, either as a secreted protein or on the cell
membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli.
Cells expressing the polypeptide (or cell membrane containing the expressed
polypeptide) are then preferably contacted with a test compound potentially containing
the molecule to observe binding, stimulation, or inhibition of activity of either the
polypeptide or the molecule.

The assay may simply test binding of a candidate compound to the polypeptide, 
wherein binding is detected by a label, or in an assay involving competition with a 
labeled competitor. Further, the assay may test whether the candidate compound results
in a signal generated by binding to the polypeptide.

Alternatively, the assay can be carried out using cell-free preparations, 
polypeptide/molecule affixed to a solid support, chemical libraries, or natural product 
mixtures. The assay may also simply comprise the steps of mixing a candidate 
compound with a solution containing a polypeptide, measuring polypeptide/molecule
activity or binding, and comparing the polypeptide/molecule activity or binding to a
standard.

Preferably, an ELISA assay can measure polypeptide level or activity in a
sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The
antibody can measure polypeptide level or activity by either binding, directly or
indirectly, to the polypeptide or by competing with the polypeptide for a substrate.

All of these above assays can be used as diagnostic or prognostic markers. The
molecules discovered using these assays can be used to treat disease or to bring about a
particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the
polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or
enhance the production of the polypeptide from suitably manipulated cells or tissues.

Therefore, the invention includes a method of identifying compounds which
bind to a polypeptide of the invention comprising the steps of: (a) incubating a
candidate binding compound with a polypeptide of the invention; and (b) determining if
binding has occurred. Moreover, the invention includes a method of identifying
agonists/antagonists comprising the steps of: (a) incubating a candidate compound with
a polypeptide of the invention, (b) assaying a biological activity, and (c) determining if
a biological activity of the polypeptide has been altered.
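
The final step of the agonist/antagonist method, determining whether a biological
activity has been altered, can be illustrated with a short calculation. The Python
sketch below is illustrative only and is not part of the disclosed method: the function
name classify_candidate, the use of mean activity readings, and the 1.5-fold cutoff are
all assumptions chosen for the example. A real assay would rely on replicates and an
appropriate statistical test rather than a fixed fold-change.

```python
from statistics import mean


def classify_candidate(control, treated, min_fold_change=1.5):
    """Compare activity readings with and without a candidate compound.

    control: activity readings for the polypeptide alone (hypothetical units).
    treated: activity readings after incubation with the candidate compound.
    min_fold_change: assumed cutoff for calling the activity "altered".
    """
    if not control or not treated:
        raise ValueError("both control and treated readings are required")
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control activity must be positive")

    ratio = mean(treated) / baseline
    if ratio >= min_fold_change:
        return "agonist-like"       # activity increased
    if ratio <= 1 / min_fold_change:
        return "antagonist-like"    # activity decreased
    return "no clear effect"        # activity not altered beyond the cutoff


if __name__ == "__main__":
    print(classify_candidate([1.0, 1.1, 0.9], [2.2, 2.0, 2.1]))
```

The symmetric cutoff (1.5-fold up or down) is a design convenience for the sketch; the
text above does not specify how a change in activity is to be quantified.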

Other Activities 

A polypeptide or polynucleotide of the present invention may also increase or
decrease the differentiation or proliferation of embryonic stem cells, in addition to, as
discussed above, cells of the hematopoietic lineage.

A polypeptide or polynucleotide of the present invention may also be used to
modulate mammalian characteristics, such as body height, weight, hair color, eye color,
skin, percentage of adipose tissue, pigmentation, size, and shape (e.g., cosmetic
surgery). Similarly, a polypeptide or polynucleotide of the present invention may be
used to modulate mammalian metabolism affecting catabolism, anabolism, processing,
utilization, and storage of energy.

A polypeptide or polynucleotide of the present invention may be used to change
a mammal's mental state or physical state by influencing biorhythms, circadian
rhythms, depression (including depressive disorders), tendency for violence, tolerance
for pain, reproductive capabilities (preferably by Activin or Inhibin-like activity),
hormonal or endocrine levels, appetite, libido, memory, stress, or other cognitive
qualities.

A polypeptide or polynucleotide of the present invention may also be used as a 
food additive or preservative, such as to increase or decrease storage capabilities, fat
content, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional
components.

Other Preferred Embodiments

Other preferred embodiments of the claimed invention include an isolated
nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical
to a sequence of at least about 50 contiguous nucleotides in the nucleotide sequence of
SEQ ID NO:X wherein X is any integer as defined in Table 1.
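
One way to read the criterion "at least 95% identical to a sequence of at least about 50
contiguous nucleotides" is as a sliding-window comparison. The Python sketch below
assumes a simple ungapped, position-by-position identity. That is an assumption made for
illustration, and an alignment-based identity calculation that handles gaps would
generally be used in practice. The function names and default parameters are
hypothetical.

```python
def window_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def has_identical_window(candidate: str, reference: str,
                         window: int = 50, threshold: float = 0.95) -> bool:
    """Return True if some `window`-long stretch of `candidate` matches some
    `window`-long contiguous stretch of `reference` at >= `threshold` identity.

    Ungapped comparison only (an assumption of this sketch).
    """
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < window or len(reference) < window:
        return False
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            if window_identity(candidate[j:j + window], ref_win) >= threshold:
                return True
    return False
```

With the defaults, up to two mismatches in a 50-nucleotide window still satisfy the
threshold (48/50 = 96%), while three mismatches do not (47/50 = 94%).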

Also preferred is a nucleic acid molecule wherein said sequence of contiguous 
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of 
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the 
Clone Sequence and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1. 

Also preferred is a nucleic acid molecule wherein said sequence of contiguous 
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of 
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the 
Start Codon and ending with the nucleotide at about the position of the 3' Nucleotide of
the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Similarly preferred is a nucleic acid molecule wherein said sequence of 
contiguous nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the 
range of positions beginning with the nucleotide at about the position of the 5' 
Nucleotide of the First Amino Acid of the Signal Peptide and ending with the nucleotide
at about the position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID 
NO:X in Table 1. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least about 150 contiguous
nucleotides in the nucleotide sequence of SEQ ID NO:X.

Further preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 500 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X. 

A further preferred embodiment is a nucleic acid molecule comprising a 
nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ
ID NO:X beginning with the nucleotide at about the position of the 5' Nucleotide of the
First Amino Acid of the Signal Peptide and ending with the nucleotide at about the
position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in
Table 1.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to the complete nucleotide
sequence of SEQ ID NO:X.

Also preferred is an isolated nucleic acid molecule which hybridizes under 
stringent hybridization conditions to a nucleic acid molecule, wherein said nucleic acid
molecule which hybridizes does not hybridize under stringent hybridization conditions 
to a nucleic acid molecule having a nucleotide sequence consisting of only A residues or 
of only T residues. 

Also preferred is a composition of matter comprising a DNA molecule which 
comprises a human cDNA clone identified by a cDNA Clone Identifier in Table 1,
which DNA molecule is contained in the material deposited with the American Type 
Culture Collection and given the ATCC Deposit Number shown in Table 1 for said 
cDNA Clone Identifier. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in the nucleotide sequence of a human cDNA clone identified by a cDNA
Clone Identifier in Table 1, which DNA molecule is contained in the deposit given the
ATCC Deposit Number shown in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said sequence of at 
least 50 contiguous nucleotides is included in the nucleotide sequence of the complete
open reading frame sequence encoded by said human cDNA clone. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 150 contiguous
nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to a sequence of at least 500
contiguous nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising 
a nucleotide sequence which is at least 95% identical to the complete nucleotide 
sequence encoded by said human cDNA clone.

A further preferred embodiment is a method for detecting in a biological sample 
a nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical 
to a sequence of at least 50 contiguous nucleotides in a sequence selected from the 
group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any integer 
as defined in Table 1; and a nucleotide sequence encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing a nucleotide sequence of at least one nucleic acid
molecule in said sample with a sequence selected from said group and determining 
whether the sequence of said nucleic acid molecule in said sample is at least 95% 
identical to said selected sequence. 

Also preferred is the above method wherein said step of comparing sequences 
comprises determining the extent of nucleic acid hybridization between nucleic acid 
molecules in said sample and a nucleic acid molecule comprising said sequence selected 
from said group. Similarly, also preferred is the above method wherein said step of 
comparing sequences is performed by comparing the nucleotide sequence determined 
from a nucleic acid molecule in said sample with said sequence selected from said 
group. The nucleic acid molecules can comprise DNA molecules or RNA molecules. 

A further preferred embodiment is a method for identifying the species, tissue or 
cell type of a biological sample which method comprises a step of detecting nucleic acid 
molecules in said sample, if any, comprising a nucleotide sequence that is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any
integer as defined in Table 1; and a nucleotide sequence encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

The method for identifying the species, tissue or cell type of a biological sample 
can comprise a step of detecting nucleic acid molecules comprising a nucleotide 
sequence in a panel of at least two nucleotide sequences, wherein at least one sequence 
in said panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides 
in a sequence selected from said group. 
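
The panel-based detection step described above can be pictured as building a
presence/absence profile for each sequence in the panel. The Python sketch below is a
hypothetical illustration, not the disclosed method: the function names, the panel
member labels, and the ungapped window comparison are assumptions, and a laboratory
implementation would more likely rely on hybridization or alignment software.

```python
def best_ungapped_identity(query: str, target: str, window: int) -> float:
    """Highest fraction of matching positions over all pairs of
    `window`-long stretches taken from `query` and `target`."""
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(target) - window + 1):
            matches = sum(a == b for a, b in zip(q, target[j:j + window]))
            best = max(best, matches / window)
    return best  # stays 0.0 if either sequence is shorter than `window`


def panel_profile(sample_sequences, panel, window=50, threshold=0.95):
    """Map each panel member name to True/False depending on whether any
    sample sequence meets the identity threshold against it.

    panel: dict of hypothetical marker name -> reference nucleotide sequence.
    """
    return {
        name: any(best_ungapped_identity(seq, ref, window) >= threshold
                  for seq in sample_sequences)
        for name, ref in panel.items()
    }
```

The resulting profile could then be compared against profiles expected for a given
species, tissue, or cell type; how such reference profiles are obtained is outside the
scope of this sketch.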

Also preferred is a method for diagnosing in a subject a pathological condition 
associated with abnormal structure or expression of a gene encoding a secreted protein 
identified in Table 1, which method comprises a step of detecting in a biological sample 
obtained from said subject nucleic acid molecules, if any, comprising a nucleotide 
sequence that is at least 95% identical to a sequence of at least 50 contiguous 
nucleotides in a sequence selected from the group consisting of: a nucleotide sequence 
of SEQ ID NO:X wherein X is any integer as defined in Table 1; and a nucleotide
sequence encoded by a human cDNA clone identified by a cDNA Clone Identifier in
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said
cDNA clone in Table 1.

The method for diagnosing a pathological condition can comprise a step of 
detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at least 
two nucleotide sequences, wherein at least one sequence in said panel is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
said group.

Also preferred is a composition of matter comprising isolated nucleic acid
molecules wherein the nucleotide sequences of said nucleic acid molecules comprise a
panel of at least two nucleotide sequences, wherein at least one sequence in said panel is
at least 95% identical to a sequence of at least 50 contiguous nucleotides in a sequence
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein
X is any integer as defined in Table 1; and a nucleotide sequence encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
nucleic acid molecules can comprise DNA molecules or RNA molecules.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 90% identical to a sequence of at least about 10 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1.

Also preferred is a polypeptide, wherein said sequence of contiguous amino
acids is included in the amino acid sequence of SEQ ID NO:Y in the range of positions
beginning with the residue at about the position of the First Amino Acid of the Secreted
Portion and ending with the residue at about the Last Amino Acid of the Open Reading
Frame as set forth for SEQ ID NO:Y in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 100 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to the complete amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 90% identical to a sequence of at least about 10 contiguous amino acids in the
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is a polypeptide wherein said sequence of contiguous amino
acids is included in the amino acid sequence of a secreted portion of the secreted protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to a sequence of at least about 100 contiguous amino acids in the 
amino acid sequence of the secreted portion of the protein encoded by a human cDNA 
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with 
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at 
least 95% identical to the amino acid sequence of the secreted portion of the protein 
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and 
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Further preferred is an isolated antibody which binds specifically to a
polypeptide comprising an amino acid sequence that is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO: Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a protein encoded by a
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in
the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method for detecting in a biological sample a polypeptide
comprising an amino acid sequence which is at least 90% identical to a sequence of at
least 10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO: Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing an amino acid sequence of at least one polypeptide
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said polypeptide molecule in said sample is at least 90%
identical to said sequence of at least 10 contiguous amino acids.
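
The comparison step can be pictured with a short Python sketch. It is only an illustration under simplifying assumptions: identity is computed position by position over ungapped windows of equal length, which is not necessarily how percent identity is determined elsewhere in this specification, and the function names and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def has_identical_window(sample: str, reference: str,
                         window: int = 10, threshold: float = 90.0) -> bool:
    """True if some `window`-residue stretch of `sample` is at least
    `threshold` percent identical to some `window`-residue stretch of
    `reference` (ungapped comparison; an assumption of this sketch)."""
    if len(sample) < window or len(reference) < window:
        return False
    # Every contiguous stretch of the reference of the requested length.
    ref_windows = {reference[i:i + window]
                   for i in range(len(reference) - window + 1)}
    for j in range(len(sample) - window + 1):
        stretch = sample[j:j + window]
        if any(percent_identity(stretch, r) >= threshold for r in ref_windows):
            return True
    return False


if __name__ == "__main__":
    reference = "MKTLLVAGLLAAQPSWA"   # hypothetical stand-in for SEQ ID NO: Y
    sample = "GSTLLVAGLLAEGS"        # hypothetical sequence read from a sample
    print(has_identical_window(sample, reference))
```

A gapped alignment would be needed to handle insertions and deletions; the sketch deliberately leaves that out.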

Also preferred is the above method wherein said step of comparing an amino
acid sequence of at least one polypeptide molecule in said sample with a sequence
selected from said group comprises determining the extent of specific binding of
polypeptides in said sample to an antibody which binds specifically to a polypeptide



wo 98/54963 



PCT/US98/11422 






comprising an amino acid sequence that is at least 90% identical to a sequence of at least
10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO: Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method wherein said step of comparing sequences is 
performed by comparing the amino acid sequence determined from a polypeptide
molecule in said sample with said sequence selected from said group. 

Also preferred is a method for identifying the species, tissue or cell type of a
biological sample which method comprises a step of detecting polypeptide molecules in
said sample, if any, comprising an amino acid sequence that is at least 90% identical to
a sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO: Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a secreted protein encoded
by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method for identifying the species, tissue or cell type
of a biological sample, which method comprises a step of detecting polypeptide
molecules comprising an amino acid sequence in a panel of at least two amino acid
sequences, wherein at least one sequence in said panel is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the above
group.

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject polypeptide molecules comprising an amino acid sequence in
a panel of at least two amino acid sequences, wherein at least one sequence in said panel
is at least 90% identical to a sequence of at least 10 contiguous amino acids in a
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO: Y wherein Y is any integer as defined in Table 1; and a complete amino acid
sequence of a secreted protein encoded by a human cDNA clone identified by a cDNA
Clone Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number
shown for said cDNA clone in Table 1.

In any of these methods, the step of detecting said polypeptide molecules
includes using an antibody.

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a nucleotide sequence encoding a
polypeptide wherein said polypeptide comprises an amino acid sequence that is at least
90% identical to a sequence of at least 10 contiguous amino acids in a sequence selected
from the group consisting of: an amino acid sequence of SEQ ID NO: Y wherein Y is
any integer as defined in Table 1; and a complete amino acid sequence of a secreted
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table
1 and contained in the deposit with the ATCC Deposit Number shown for said cDNA
clone in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said nucleotide
sequence encoding a polypeptide has been optimized for expression of said polypeptide
in a prokaryotic host.

Also preferred is an isolated nucleic acid molecule, wherein said polypeptide
comprises an amino acid sequence selected from the group consisting of: an amino acid
sequence of SEQ ID NO: Y wherein Y is any integer as defined in Table 1; and a
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method of making a recombinant vector comprising
inserting any of the above isolated nucleic acid molecules into a vector. Also preferred is
the recombinant vector produced by this method. Also preferred is a method of making
a recombinant host cell comprising introducing the vector into a host cell, as well as the
recombinant host cell produced by this method.

Also preferred is a method of making an isolated polypeptide comprising
culturing this recombinant host cell under conditions such that said polypeptide is
expressed and recovering said polypeptide. Also preferred is this method of making an
isolated polypeptide, wherein said recombinant host cell is a eukaryotic cell and said
polypeptide is a secreted portion of a human secreted protein comprising an amino acid
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO: Y beginning with the residue at the position of the First Amino Acid of the Secreted
Portion of SEQ ID NO: Y wherein Y is an integer set forth in Table 1 and said position
of the First Amino Acid of the Secreted Portion of SEQ ID NO: Y is defined in Table 1;
and an amino acid sequence of a secreted portion of a protein encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
isolated polypeptide produced by this method is also preferred.

Also preferred is a method of treatment of an individual in need of an increased
level of a secreted protein activity, which method comprises administering to such an
individual a pharmaceutical composition comprising an amount of an isolated
polypeptide, polynucleotide, or antibody of the claimed invention effective to increase
the level of said protein activity in said individual.

Having generally described the invention, the same will be more readily 
understood by reference to the following examples, which are provided by way of 
illustration and are not intended as limiting. 

Examples

Example 1: Isolation of a Selected cDNA Clone From the Deposited 
Sample 

Each cDNA clone in a cited ATCC deposit is contained in a plasmid vector.
Table 1 identifies the vectors used to construct the cDNA library from which each clone
was isolated. In many cases, the vector used to construct the library is a phage vector
from which a plasmid has been excised. The table immediately below correlates the
related plasmid for each phage vector used in constructing the cDNA library. For
example, where a particular clone is identified in Table 1 as being isolated in the vector
"Lambda Zap," the corresponding deposited clone is in "pBluescript."

Vector Used to Construct Library     Corresponding Deposited Plasmid
Lambda Zap                           pBluescript (pBS)
Uni-Zap XR                           pBluescript (pBS)
Zap Express                          pBK
lafmid BA                            plafmid BA
pSport1                              pSport1
pCMVSport 2.0                        pCMVSport 2.0
pCMVSport 3.0                        pCMVSport 3.0
pCR®2.1                              pCR®2.1

Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos.
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res.
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res.
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines
Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and pBK
contains a neomycin resistance gene. Both can be transformed into E. coli strain XL-1
Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ and KS-.
The S and K refer to the orientation of the polylinker to the T7 and T3 primer
sequences which flank the polylinker region ("S" is for SacI and "K" is for KpnI, which
are the first sites on each respective end of the linker). "+" or "-" refer to the orientation
of the f1 origin of replication ("ori"), such that in one orientation, single stranded rescue
initiated from the f1 ori generates sense strand DNA and in the other, antisense.

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0 were obtained from
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897. All Sport vectors
contain an ampicillin resistance gene and may be transformed into E. coli strain
DH10B, also available from Life Technologies. (See, for instance, Gruber, C. E., et
al., Focus 15:59 (1993).) Vector lafmid BA (Bento Soares, Columbia University, NY)
contains an ampicillin resistance gene and can be transformed into E. coli strain XL-1
Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday Avenue,
Carlsbad, CA 92008, contains an ampicillin resistance gene and may be transformed
into E. coli strain DH10B, available from Life Technologies. (See, for instance, Clark,
J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., Bio/Technology 9:
(1991).) Preferably, a polynucleotide of the present invention comprises neither the
phage vector sequences identified for the particular clone in Table 1 nor the
corresponding plasmid vector sequences designated above.

The deposited material in the sample assigned the ATCC Deposit Number cited
in Table 1 for any given cDNA clone also may contain one or more additional plasmids,
each comprising a cDNA clone different from that given clone. Thus, deposits sharing
the same ATCC Deposit Number contain at least a plasmid for each cDNA clone
identified in Table 1. Typically, each ATCC deposit sample cited in Table 1 comprises
a mixture of approximately equal amounts (by weight) of about 50 plasmid DNAs, each
containing a different cDNA clone; but such a deposit sample may include plasmids for
more or fewer than 50 cDNA clones, up to about 500 cDNA clones.

Two approaches can be used to isolate a particular clone from the deposited
sample of plasmid DNAs cited for that clone in Table 1. First, a plasmid is directly
isolated by screening the clones using a polynucleotide probe corresponding to SEQ ID
NO:X.

Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized
using an Applied Biosystems DNA synthesizer according to the sequence reported.
The oligonucleotide is labeled, for instance, with ³²P-γ-ATP using T4 polynucleotide
kinase and purified according to routine methods. (E.g., Maniatis et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY (1982).)
The plasmid mixture is transformed into a suitable host, as indicated above (such as
XL-1 Blue (Stratagene)) using techniques known to those of skill in the art, such as
those provided by the vector supplier or in related publications or patents cited above.
The transformants are plated on 1.5% agar plates (containing the appropriate selection
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate.
These plates are screened using Nylon membranes according to routine methods for
bacterial colony screening (e.g., Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd Edit., (1989), Cold Spring Harbor Laboratory Press, pages 1.93 to
1.104), or other techniques known to those of skill in the art.
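
To give a feel for the scale of such a screen, the following Python sketch estimates how many colonies would have to be examined to recover a given clone from a pooled deposit. It is a back-of-the-envelope model only: it assumes every colony carries one plasmid drawn uniformly at random from equally represented plasmids (the deposit is described as equal by weight, not by copy number), and the function names and the 99% confidence level are choices of this sketch.

```python
import math


def colonies_needed(n_clones: int, confidence: float = 0.99) -> int:
    """Smallest number of colonies such that a given clone appears at least
    once with probability >= confidence, assuming uniform random sampling
    from n_clones equally represented plasmids."""
    if n_clones < 1:
        raise ValueError("n_clones must be at least 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if n_clones == 1:
        return 1
    p_miss = 1.0 - 1.0 / n_clones          # chance one colony is not the clone
    return math.ceil(math.log(1.0 - confidence) / math.log(p_miss))


def plates_needed(n_colonies: int, colonies_per_plate: int = 150) -> int:
    """Plates required at the plating density given in the text."""
    return math.ceil(n_colonies / colonies_per_plate)


if __name__ == "__main__":
    for pool in (50, 500):                 # pool sizes mentioned in the text
        n = colonies_needed(pool)
        print(pool, n, plates_needed(n))
```

Under these assumptions a pool of about 50 plasmids calls for roughly 228 colonies, or two plates at 150 colonies per plate; unequal representation in the pool would raise that number.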

Alternatively, two primers of 17-20 nucleotides derived from both ends of the
SEQ ID NO:X (i.e., within the region of SEQ ID NO:X bounded by the 5' NT and the
3' NT of the clone defined in Table 1) are synthesized and used to amplify the desired
cDNA using the deposited cDNA plasmid as a template. The polymerase chain reaction
is carried out under routine conditions, for instance, in 25 µl of reaction mixture with
0.5 µg of the above cDNA template. A convenient reaction mixture is 1.5-5 mM
MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of
each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation
at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are
performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified product
is analyzed by agarose gel electrophoresis and the DNA band with expected molecular
weight is excised and purified. The PCR product is verified to be the selected sequence
by subcloning and sequencing the DNA product.
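
The derivation of the two end primers can be sketched in Python as follows. This is an illustration rather than a primer design tool: it simply takes the first and last nucleotides of the bounded region, ignores melting temperature and secondary structure, and assumes 1-based inclusive positions for the 5' NT and 3' NT. The function names and the example sequence are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def end_primers(template: str, five_nt: int, three_nt: int,
                length: int = 20) -> tuple[str, str]:
    """Forward and reverse primers taken from the two ends of the region
    template[five_nt..three_nt] (1-based, inclusive; an assumed convention).

    The forward primer is the first `length` bases of the region; the
    reverse primer is the reverse complement of the last `length` bases.
    """
    if not 1 <= five_nt <= three_nt <= len(template):
        raise ValueError("positions must satisfy 1 <= five_nt <= three_nt <= len")
    region = template[five_nt - 1:three_nt]
    if len(region) < 2 * length:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical stand-in for SEQ ID NO:X.
    seq = "GGATGGCCAAGCTTGGATCCGAATTCCTGCAGTCTAGAGGTACCTGAGG"
    fwd, rev = end_primers(seq, five_nt=3, three_nt=47, length=17)
    print(fwd, rev)
```

The text specifies 17-20 nucleotides, so `length` would normally be chosen in that range.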

Several methods are available for the identification of the 5' or 3' non-coding
portions of a gene which may not be present in the deposited clone. These methods
include, but are not limited to, filter probing, clone enrichment using specific probes,
and protocols similar or identical to 5' and 3' "RACE" protocols which are well known
in the art. For instance, a method similar to 5' RACE is available for generating the
missing 5' end of a desired full-length transcript. (Fromont-Racine et al., Nucleic Acids
Res. 21(7):1683-1684 (1993).)

Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population
of RNA presumably containing full-length gene RNA transcripts. A primer set
containing a primer specific to the ligated RNA oligonucleotide and a primer specific to
a known sequence of the gene of interest is used to PCR amplify the 5' portion of the
desired full-length gene. This amplified product may then be sequenced and used to
generate the full length gene.

This above method starts with total RNA isolated from the desired source,
although poly-A+ RNA can be used. The RNA preparation can then be treated with
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged
RNA which may interfere with the later RNA ligase step. The phosphatase should then
be inactivated and the RNA treated with tobacco acid pyrophosphatase in order to
remove the cap structure present at the 5' ends of messenger RNAs. This reaction
leaves a 5' phosphate group at the 5' end of the cap cleaved RNA which can then be
ligated to an RNA oligonucleotide using T4 RNA ligase.

This modified RNA preparation is used as a template for first strand cDNA
synthesis using a gene specific oligonucleotide. The first strand synthesis reaction is
used as a template for PCR amplification of the desired 5' end using a primer specific to
the ligated RNA oligonucleotide and a primer specific to the known sequence of the
gene of interest. The resultant product is then sequenced and analyzed to confirm that
the 5' end sequence belongs to the desired gene.


Example 2: Isolation of Genomic Clones Corresponding to a 
Polynucleotide 

A human genomic P1 library (Genomic Systems, Inc.) is screened by PCR
using primers selected for the cDNA sequence corresponding to SEQ ID NO:X,
according to the method described in Example 1. (See also, Sambrook.)

Example 3: Tissue Distribution of Polypeptide 

Tissue distribution of mRNA expression of polynucleotides of the present
invention is determined using protocols for Northern blot analysis, described by,
among others, Sambrook et al. For example, a cDNA probe produced by the method
described in Example 1 is labeled with ³²P using the rediprime™ DNA labeling system
(Amersham Life Science), according to manufacturer's instructions. After labeling, the
probe is purified using a CHROMA SPIN-100™ column (Clontech Laboratories, Inc.),
according to manufacturer's protocol number PT1200-1. The purified labeled probe is
then used to examine various human tissues for mRNA expression.

Multiple Tissue Northern (MTN) blots containing various human tissues (H) or
human immune system tissues (IM) (Clontech) are examined with the labeled probe
using ExpressHyb™ hybridization solution (Clontech) according to manufacturer's
protocol number PT1190-1. Following hybridization and washing, the blots are
mounted and exposed to film at -70°C overnight, and the films developed according to
standard procedures.

Example 4: Chromosomal Mapping of the Polynucleotides

An oligonucleotide primer set is designed according to the sequence at the 5'
end of SEQ ID NO:X. This primer set preferably spans about 100 nucleotides. This
primer set is then used in a polymerase chain reaction under the following set of
conditions: 30 seconds, 95°C; 1 minute, 56°C; 1 minute, 70°C. This cycle is repeated
32 times followed by one 5 minute cycle at 70°C. Human, mouse, and hamster DNA
is used as template in addition to a somatic cell hybrid panel containing individual
chromosomes or chromosome fragments (Bios, Inc). The reactions are analyzed on
either 8% polyacrylamide gels or 3.5% agarose gels. Chromosome mapping is
determined by the presence of an approximately 100 bp PCR fragment in the particular
somatic cell hybrid.
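
The assignment logic can be expressed as a small concordance check. The Python sketch below is an illustration under assumptions not stated in the text: it supposes that the human chromosome content of each hybrid is known, that the PCR result for each hybrid is a clean yes or no, and it uses hypothetical hybrid names. It reports the chromosomes present in every positive hybrid and absent from every negative one.

```python
def map_chromosome(panel: dict[str, set[str]],
                   positive: set[str]) -> set[str]:
    """Chromosomes consistent with the PCR results.

    panel    -- hybrid name -> set of human chromosomes it carries
    positive -- hybrids in which the ~100 bp fragment was observed
    """
    unknown = positive - set(panel)
    if unknown:
        raise ValueError(f"hybrids not in panel: {sorted(unknown)}")
    if not positive:
        return set()
    # Must be present in every positive hybrid ...
    candidates = set.intersection(*(panel[h] for h in positive))
    # ... and absent from every negative hybrid.
    for hybrid, chromosomes in panel.items():
        if hybrid not in positive:
            candidates -= chromosomes
    return candidates


if __name__ == "__main__":
    # Hypothetical panel; real panels list one or more chromosomes per hybrid.
    panel = {"hyb1": {"1"}, "hyb2": {"2"}, "hyb3": {"2", "7"}}
    print(map_chromosome(panel, positive={"hyb2", "hyb3"}))
```

An empty result would indicate inconsistent data, for example a false negative lane or a fragment shared by more than one chromosome.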

Example 5: Bacterial Expression of a Polypeptide

A polynucleotide encoding a polypeptide of the present invention is amplified
using PCR oligonucleotide primers corresponding to the 5' and 3' ends of the DNA
sequence, as outlined in Example 1, to synthesize insertion fragments. The primers
used to amplify the cDNA insert should preferably contain restriction sites, such as
BamHI and XbaI, at the 5' end of the primers in order to clone the amplified product
into the expression vector. For example, BamHI and XbaI correspond to the restriction
enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., Chatsworth,
CA). This plasmid vector encodes antibiotic resistance (Ampr), a bacterial origin of
replication (ori), an IPTG-regulatable promoter/operator (P/O), a ribosome binding site
(RBS), a 6-histidine tag (6-His), and restriction enzyme cloning sites.

The pQE-9 vector is digested with BamHI and XbaI and the amplified fragment
is ligated into the pQE-9 vector maintaining the reading frame initiated at the bacterial
RBS. The ligation mixture is then used to transform the E. coli strain M15/rep4
(Qiagen, Inc.) which contains multiple copies of the plasmid pREP4, which expresses
the lacI repressor and also confers kanamycin resistance (Kanr). Transformants are
identified by their ability to grow on LB plates, and ampicillin/kanamycin resistant
colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis.

Clones containing the desired constructs are grown overnight (O/N) in liquid
culture in LB media supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml).
The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The
cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG
(isopropyl-β-D-thiogalactopyranoside) is then added to a final concentration of 1 mM.
IPTG induces expression by inactivating the lacI repressor, which clears the P/O and
leads to increased gene expression.
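
The inoculation and induction volumes follow from simple dilution arithmetic, sketched below in Python. The 1 M IPTG stock and the 1 litre culture volume are assumptions chosen for illustration; the text specifies only the 1:100 to 1:250 inoculation ratio and the 1 mM final IPTG concentration. The function names are hypothetical.

```python
def inoculum_volume_ml(culture_volume_ml: float, ratio: float) -> float:
    """Volume of overnight culture for a 1:ratio inoculation."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return culture_volume_ml / ratio


def stock_volume_ml(final_conc_mM: float, culture_volume_ml: float,
                    stock_conc_mM: float) -> float:
    """Volume of stock to add so the mixture reaches final_conc_mM.

    Solves C_stock * V = C_final * (V_culture + V), i.e. the added volume
    is counted in the final volume.
    """
    if stock_conc_mM <= final_conc_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc_mM * culture_volume_ml / (stock_conc_mM - final_conc_mM)


if __name__ == "__main__":
    culture_ml = 1000.0                      # assumed culture size
    print(inoculum_volume_ml(culture_ml, 100))   # 1:100 inoculation
    print(inoculum_volume_ml(culture_ml, 250))   # 1:250 inoculation
    # Assumed 1 M (1000 mM) IPTG stock, 1 mM final concentration.
    print(stock_volume_ml(1.0, culture_ml, 1000.0))
```

For a 1 litre culture this gives 4 to 10 ml of overnight culture and about 1 ml of a 1 M IPTG stock.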

Cells are grown for an extra 3 to 4 hours. Cells are then harvested by
centrifugation (20 mins at 6000 x g). The cell pellet is solubilized in the chaotropic
agent 6 Molar Guanidine HCl by stirring for 3-4 hours at 4°C. The cell debris is
removed by centrifugation, and the supernatant containing the polypeptide is loaded
onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (available from
QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with high
affinity and can be purified in a simple one-step procedure (for details see: The
QIAexpressionist (1995) QIAGEN, Inc., supra).

Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 8,
the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed
with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the polypeptide is eluted with
6 M guanidine-HCl, pH 5.

The purified protein is then renatured by dialyzing it against phosphate-buffered
saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the
protein can be successfully refolded while immobilized on the Ni-NTA column. The
recommended conditions are as follows: renature using a linear 6 M-1 M urea gradient in
500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors.
The renaturation should be performed over a period of 1.5 hours or more. After
renaturation the proteins are eluted by the addition of 250 mM imidazole. Imidazole
is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer
plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at -80°C.
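
The on-column urea gradient can be written as a simple linear function of time, as in the Python sketch below. The 90 minute duration is the minimum stated in the text and is used here only as a default; the function name and the clamping of times outside the gradient are choices of this sketch.

```python
def urea_molar(t_min: float, start_M: float = 6.0, end_M: float = 1.0,
               duration_min: float = 90.0) -> float:
    """Urea concentration (M) at time t_min of a linear gradient.

    Times before the start or after the end are clamped to the start and
    end concentrations respectively.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    fraction = min(max(t_min / duration_min, 0.0), 1.0)
    return start_M + (end_M - start_M) * fraction


if __name__ == "__main__":
    for t in (0, 30, 45, 60, 90):
        print(t, urea_molar(t))
```

With the default values the gradient falls by 5 M over 90 minutes, so the midpoint at 45 minutes is 3.5 M urea; a longer renaturation simply stretches the same line.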

In addition to the above expression vector, the present invention further includes
an expression vector comprising phage operator and promoter elements operatively
linked to a polynucleotide of the present invention, called pHE4a. (ATCC Accession
Number 209645, deposited on February 25, 1998.) This vector contains: 1) a
neomycinphosphotransferase gene as a selection marker, 2) an E. coli origin of
replication, 3) a T5 phage promoter sequence, 4) two lac operator sequences, 5) a
Shine-Dalgarno sequence, and 6) the lactose operon repressor gene (lacIq). The origin
of replication (oriC) is derived from pUC19 (LTI, Gaithersburg, MD). The promoter
sequence and operator sequences are made synthetically.

DNA can be inserted into the pHE4a by restricting the vector with NdeI and
XbaI, BamHI, XhoI, or Asp718, running the restricted product on a gel, and isolating
the larger fragment (the stuffer fragment should be about 310 base pairs). The DNA
insert is generated according to the PCR protocol described in Example 1, using PCR
primers having restriction sites for NdeI (5' primer) and XbaI, BamHI, XhoI, or
Asp718 (3' primer). The PCR insert is gel purified and restricted with compatible
enzymes. The insert and vector are ligated according to standard protocols.

The engineered vector could easily be substituted in the above protocol to
express protein in a bacterial system.

Example 6: Purification of a Polypeptide from an Inclusion Body 

The following alternative method can be used to purify a polypeptide expressed
in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified,
all of the following steps are conducted at 4-10°C.

Upon completion of the production phase of the E. coli fermentation, the cell
culture is cooled to 4-10°C and the cells harvested by continuous centrifugation at
15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit
weight of cell paste and the amount of purified protein required, an appropriate amount
of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50
mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a
high shear mixer.

The cells are then lysed by passing the solution through a microfluidizer
(Microfluidics Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is
then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by
centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5 M
NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine
hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the
pellet is discarded and the polypeptide containing supernatant is incubated at 4°C
overnight to allow further GuHCl extraction.

Following high speed centrifugation (30,000 x g) to remove insoluble particles,
the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20
volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by
vigorous stirring. The refolded diluted protein solution is kept at 4°C without mixing
for 12 hours prior to further purification steps.
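
The effect of the rapid dilution on the denaturant can be checked with a one-line calculation, sketched here in Python. It assumes the refolding buffer itself contains no GuHCl and that the extract is still at the 1.5 M used for solubilization; the function name is hypothetical.

```python
def diluted_concentration(conc: float, volumes_of_diluent: float) -> float:
    """Concentration after mixing one volume of sample with the given
    number of volumes of diluent that contains none of the solute."""
    if volumes_of_diluent < 0:
        raise ValueError("volumes_of_diluent must be non-negative")
    return conc / (1.0 + volumes_of_diluent)


if __name__ == "__main__":
    # 1.5 M GuHCl extract mixed with 20 volumes of refolding buffer.
    print(round(diluted_concentration(1.5, 20), 4))
```

One volume of extract plus 20 volumes of buffer is a 21-fold dilution, which brings 1.5 M GuHCl down to roughly 0.07 M.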

To clarify the refolded polypeptide solution, a previously prepared tangential
filtration unit equipped with 0.16 µm membrane filter with appropriate surface area
(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The
filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive
Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted
with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a
stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored.
Fractions are collected and further analyzed by SDS-PAGE.

Fractions containing the polypeptide are then pooled and mixed with 4 volumes
of water. The diluted sample is then loaded onto a previously prepared set of tandem
columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion
(Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated
with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium
acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column
volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0
M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280
monitoring of the effluent. Fractions containing the polypeptide (determined, for
instance, by 16% SDS-PAGE) are then pooled.

The resultant polypeptide should exhibit greater than 95% purity after the above
refolding and purification steps. No major contaminant bands should be observed on a
Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
The purified protein can also be tested for endotoxin/LPS contamination, and typically
the LPS content is less than 0.1 ng/ml according to LAL assays.

Example 7: Cloning and Expression of a Polypeptide in a Baculovirus
Expression System

In this example, the plasmid shuttle vector pA2 is used to insert a polynucleotide
into a baculovirus to express a polypeptide. This expression vector contains the strong
polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus
(AcMNPV) followed by convenient restriction sites such as BamHI, Xba I and
Asp718. The polyadenylation site of the simian virus 40 ("SV40") is used for efficient
polyadenylation. For easy selection of recombinant virus, the plasmid contains the
beta-galactosidase gene from E. coli under control of a weak Drosophila promoter in the
same orientation, followed by the polyadenylation signal of the polyhedrin gene. The
inserted genes are flanked on both sides by viral sequences for cell-mediated
homologous recombination with wild-type viral DNA to generate a viable virus that
expresses the cloned polynucleotide.

Many other baculovirus vectors can be used in place of the vector above, such
as pAc373, pVL941, and pAcIM1, as one skilled in the art would readily appreciate, as
long as the construct provides appropriately located signals for transcription,
translation, secretion and the like, including a signal peptide and an in-frame AUG as
required. Such vectors are described, for instance, in Luckow et al., Virology 170:31-39
(1989).

Specifically, the cDNA sequence contained in the deposited clone, including the
AUG initiation codon and the naturally associated leader sequence identified in Table 1,
is amplified using the PCR protocol described in Example 1. If the naturally occurring
signal sequence is used to produce the secreted protein, the pA2 vector does not need a
second signal peptide. Alternatively, the vector can be modified (pA2 GP) to include a
baculovirus leader sequence, using the standard methods described in Summers et al.,
"A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures,"
Texas Agricultural Experimental Station Bulletin No. 1555 (1987).

The amplified fragment is isolated from a 1% agarose gel using a commercially
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested
with appropriate restriction enzymes and again purified on a 1% agarose gel.

The plasmid is digested with the corresponding restriction enzymes and,
optionally, can be dephosphorylated using calf intestinal phosphatase, using routine
procedures known in the art. The DNA is then isolated from a 1% agarose gel using a
commercially available kit ("Geneclean" BIO 101 Inc., La Jolla, Ca.).

The fragment and the dephosphorylated plasmid are ligated together with T4
DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1 Blue
(Stratagene Cloning Systems, La Jolla, CA) cells are transformed with the ligation
mixture and spread on culture plates. Bacteria containing the plasmid are identified by
digesting DNA from individual colonies and analyzing the digestion product by gel
electrophoresis. The sequence of the cloned fragment is confirmed by DNA
sequencing.

Five µg of a plasmid containing the polynucleotide is co-transfected with 1.0 µg
of a commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus
DNA", Pharmingen, San Diego, CA), using the lipofection method described by
Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg of
BaculoGold™ virus DNA and 5 µg of the plasmid are mixed in a sterile well of a
microtiter plate containing 50 µl of serum-free Grace's medium (Life Technologies
Inc., Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium are
added, mixed and incubated for 15 minutes at room temperature. Then the transfection
mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm
tissue culture plate with 1 ml Grace's medium without serum. The plate is then
incubated for 5 hours at 27°C. The transfection solution is then removed from the plate
and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum is added.
Cultivation is then continued at 27°C for four days.

After four days the supernatant is collected and a plaque assay is performed, as 
described by Summers and Smith, supra. An agarose gel with "Blue Gal" (Life 
Technologies Inc., Gaithersburg) is used to allow easy identification and isolation of 
gal-expressing clones, which produce blue-stained plaques. (A detailed description of a 
"plaque assay" of this type can also be found in the user's guide for insect cell culture 
and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10.) 
After appropriate incubation, blue stained plaques are picked with the tip of a 
micropipettor (e.g., Eppendorf). The agar containing the recombinant viruses is then
resuspended in a microcentrifuge tube containing 200 µl of Grace's medium and the
suspension containing the recombinant baculovirus is used to infect Sf9 cells seeded in
35 mm dishes. Four days later the supernatants of these culture dishes are harvested
and then they are stored at 4° C.

To verify the expression of the polypeptide, Sf9 cells are grown in Grace's 
medium supplemented with 10% heat-inactivated FBS. The cells are infected with the 
recombinant baculovirus containing the polynucleotide at a multiplicity of infection 
("MOI") of about 2. If radiolabeled proteins are desired, 6 hours later the medium is 
removed and is replaced with SF900 II medium minus methionine and cysteine 
(available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 µCi of 35S-methionine
and 5 µCi 35S-cysteine (available from Amersham) are added. The cells are
further incubated for 16 hours and then are harvested by centrifugation. The proteins 
in the supernatant as well as the intracellular proteins are analyzed by SDS-PAGE 
followed by autoradiography (if radiolabeled). 

Microsequencing of the amino acid sequence of the amino terminus of purified 
protein may be used to determine the amino terminal sequence of the produced 
protein. 

Example 8: Expression of a Polypeptide in Mammalian Cells

The polypeptide of the present invention can be expressed in a mammalian cell.
A typical mammalian expression vector contains a promoter element, which mediates
the initiation of transcription of mRNA, a protein coding sequence, and signals required
for the termination of transcription and polyadenylation of the transcript. Additional
elements include enhancers, Kozak sequences and intervening sequences flanked by
donor and acceptor sites for RNA splicing. Highly efficient transcription is achieved
with the early and late promoters from SV40, the long terminal repeats (LTRs) from
retroviruses, e.g., RSV, HTLV-I, HIV-I, and the early promoter of the cytomegalovirus
(CMV). However, cellular elements can also be used (e.g., the human actin promoter).

Suitable expression vectors for use in practicing the present invention include, 
for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden), 
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146), pBC12MI (ATCC 67109),
pCMVSport 2.0, and pCMVSport 3.0. Mammalian host cells that could be used
include human HeLa, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1,
Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO)
cells.

Alternatively, the polypeptide can be expressed in stable cell lines containing the
polynucleotide integrated into a chromosome. Co-transfection with a selectable
marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation
of the transfected cells.

The transfected gene can also be amplified to express large amounts of the
encoded protein. The DHFR (dihydrofolate reductase) marker is useful in developing
cell lines that carry several hundred or even several thousand copies of the gene of
interest. (See, e.g., Alt, F. W., et al., J. Biol. Chem. 253:1357-1370 (1978); Hamlin,
J. L. and Ma, C., Biochem. et Biophys. Acta, 1097:107-143 (1990); Page, M. J. and
Sydenham, M. A., Biotechnology 9:64-68 (1991).) Another useful selection marker is
the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991);
Bebbington et al., Bio/Technology 10:169-175 (1992)). Using these markers, the
mammalian cells are grown in selective medium and the cells with the highest resistance
are selected. These cell lines contain the amplified gene(s) integrated into a
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the
production of proteins.

Derivatives of the plasmid pSV2-dhfr (ATCC Accession No. 37146), the
expression vectors pC4 (ATCC Accession No. 209646) and pC6 (ATCC Accession
No. 209647) contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the
CMV-enhancer (Boshart et al., Cell 41:521-530 (1985)). Multiple cloning sites, e.g.,
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the
cloning of the gene of interest. The vectors also contain the 3' intron, the
polyadenylation and termination signal of the rat preproinsulin gene, and the mouse
DHFR gene under control of the SV40 early promoter.

Specifically, the plasmid pC6, for example, is digested with appropriate
restriction enzymes and then dephosphorylated using calf intestinal phosphatase by
procedures known in the art. The vector is then isolated from a 1% agarose gel.

A polynucleotide of the present invention is amplified according to the protocol 
outlined in Example 1. If the naturally occurring signal sequence is used to produce the 
secreted protein, the vector does not need a second signal peptide. Alternatively, if the 
naturally occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.)

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested 
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The amplified fragment is then digested with the same restriction enzyme and
purified on a 1% agarose gel. The isolated fragment and the dephosphorylated vector
are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then
transformed and bacteria are identified that contain the fragment inserted into plasmid
pC6 using, for instance, restriction enzyme analysis.

Chinese hamster ovary cells lacking an active DHFR gene are used for
transfection. Five µg of the expression plasmid pC6 is cotransfected with 0.5 µg of the
plasmid pSVneo using lipofectin (Felgner et al., supra). The plasmid pSV2-neo
contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that
confers resistance to a group of antibiotics including G418. The cells are seeded in
alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are
trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus
MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418.
After about 10-14 days single clones are trypsinized and then seeded in 6-well petri
dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM,
200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of
methotrexate are then transferred to new 6-well plates containing even higher
concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 µM, 20 µM). The same
procedure is repeated until clones are obtained which grow at a concentration of 100 -
200 µM. Expression of the desired gene product is analyzed, for instance, by SDS-PAGE
and Western blot or by reversed phase HPLC analysis.



Example 9: Protein Fusions

The polypeptides of the present invention are preferably fused to other proteins.
These fusion proteins can be used for a variety of applications. For example, fusion of
the present polypeptides to His-tag, HA-tag, protein A, IgG domains, and maltose
binding protein facilitates purification. (See Example 5; see also EP A 394,827;
Traunecker, et al., Nature 331:84-86 (1988).) Similarly, fusion to IgG-1, IgG-3, and
albumin increases the half-life in vivo. Nuclear localization signals fused to the
polypeptides of the present invention can target the protein to a specific subcellular
localization, while covalent heterodimers or homodimers can increase or decrease the
activity of a fusion protein. Fusion proteins can also create chimeric molecules having
more than one function. Finally, fusion proteins can increase solubility and/or stability 
of the fused protein compared to the non-fused protein. All of the types of fusion 
proteins described above can be made by modifying the following protocol, which 
outlines the fusion of a polypeptide to an IgG molecule, or the protocol described in
Example 5.

Briefly, the human Fc portion of the IgG molecule can be PCR amplified, using
primers that span the 5' and 3' ends of the sequence described below. These primers
also should have convenient restriction enzyme sites that will facilitate cloning into an
expression vector, preferably a mammalian expression vector.

For example, if pC4 (Accession No. 209646) is used, the human Fc portion can
be ligated into the BamHI cloning site. Note that the 3' BamHI site should be
destroyed. Next, the vector containing the human Fc portion is re-restricted with
BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated
by the PCR protocol described in Example 1, is ligated into this BamHI site. Note that
the polynucleotide is cloned without a stop codon; otherwise a fusion protein will not
be produced.
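
The requirement that the insert carry no stop codon can be checked before ligation. The following Python sketch is illustrative only and is not part of the protocol: the function name `is_fusion_ready` and both example sequences are hypothetical, and it assumes the insert is supplied as the coding strand with the reading frame starting at its first base.

```python
# Hypothetical helper (not from the specification): checks that an insert
# can be read through into a downstream fusion partner, such as the Fc
# portion, without terminating translation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_fusion_ready(insert: str) -> bool:
    """Return True if the insert is a whole number of codons and contains
    no in-frame stop codon (frame assumed to start at the first base)."""
    seq = insert.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # a partial codon would shift the downstream frame
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Made-up example inserts (not taken from any SEQ ID NO).
with_stop = "ATGGCTGCATGA"   # ATG GCT GCA TGA: ends in a stop codon
without_stop = "ATGGCTGCA"   # ATG GCT GCA: reads through

print(is_fusion_ready(with_stop), is_fusion_ready(without_stop))
```

A check of this kind only confirms that the insert itself does not terminate; whether the junction with the Fc portion stays in frame still depends on the cloning sites actually used.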

If the naturally occurring signal sequence is used to produce the secreted 
protein, pC4 does not need a second signal peptide. Alternatively, if the naturally 
occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.)

Human IgG Fc region: 

GGGATCCGGAGCCCAAATCTTCTGACAAAACTCACACATGCCCACCGTGCC 
CAGCACCTGAATTCGAGGGTGCACCGTCAGTCTTCCTCTTCCCCCCAAAACC 
CAAGGACACCCTCATGATCTCCCGGACTCCTGAGGTCACATGCGTGGTGGT
GGACGTAAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACG
GCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAAC
AGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTG
AATGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGCCCTCCCAACCCCC
ATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAACCACAGGT
GTACACCCTGCCCCCATCCCGGGATGAGCTGACCAAGAACCAGGTCAGCCT
GACCTGCCTGGTCAAAGGCTTCTATCCAAGCGACATCGCCGTGGAGTGGGA
GAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGG
ACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTGGACAAGA
GGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGC
ACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGGGTAAATGAGTGC
GACGGCCGCGACTCTAGAGGAT (SEQ ID NO:1)

Example 10: Production of an Antibody from a Polypeptide

The antibodies of the present invention can be prepared by a variety of methods.
(See, Current Protocols, Chapter 2.) For example, cells expressing a polypeptide of
the present invention are administered to an animal to induce the production of sera
containing polyclonal antibodies. In a preferred method, a preparation of the secreted
protein is prepared and purified to render it substantially free of natural contaminants.
Such a preparation is then introduced into an animal in order to produce polyclonal
antisera of greater specific activity.

In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology. (Kohler et al., Nature
256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J.
Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell
Hybridomas, Elsevier, N.Y., pp. 563-681 (1981).) In general, such procedures
involve immunizing an animal (preferably a mouse) with polypeptide or, more
preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in
any suitable tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at
about 56° C), and supplemented with about 10 g/l of nonessential amino acids, about
1,000 U/ml of penicillin, and about 100 µg/ml of streptomycin.

The splenocytes of such mice are extracted and fused with a suitable myeloma
cell line. Any suitable myeloma cell line may be employed in accordance with the
present invention; however, it is preferable to employ the parent myeloma cell line
(SP20), available from the ATCC. After fusion, the resulting hybridoma cells are
selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981)). The hybridoma cells
obtained through such a selection are then assayed to identify clones which secrete
antibodies capable of binding the polypeptide.

Alternatively, additional antibodies capable of binding to the polypeptide can be
produced in a two-step procedure using anti-idiotypic antibodies. Such a method
makes use of the fact that antibodies are themselves antigens, and therefore, it is
possible to obtain an antibody which binds to a second antibody. In accordance with
this method, protein specific antibodies are used to immunize an animal, preferably a
mouse. The splenocytes of such an animal are then used to produce hybridoma cells,
and the hybridoma cells are screened to identify clones which produce an antibody
whose ability to bind to the protein-specific antibody can be blocked by the polypeptide.
Such antibodies comprise anti-idiotypic antibodies to the protein-specific antibody and
can be used to immunize an animal to induce formation of further protein-specific
antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the antibodies
of the present invention may be used according to the methods disclosed herein. Such
fragments are typically produced by proteolytic cleavage, using enzymes such as papain
(to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively,
secreted protein-binding fragments can be produced through the application of
recombinant DNA technology or through synthetic chemistry.

For in vivo use of antibodies in humans, it may be preferable to use 
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced using 
genetic constructs derived from hybridoma cells producing the monoclonal antibodies 
described above. Methods for producing chimeric antibodies are known in the art. 

(See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214
(1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496;
Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO
8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).)

Example 11: Production of Secreted Protein for High-Throughput Screening Assays

The following protocol produces a supernatant containing a polypeptide to be 
tested. This supernatant can then be used in the Screening Assays described in
Examples 13-20.

First, dilute Poly-D-Lysine (644 587 Boehringer-Mannheim) stock solution
(1 mg/ml in PBS) 1:20 in PBS (w/o calcium or magnesium 17-516F Biowhittaker) for a
working solution of 50 ug/ml. Add 200 ul of this solution to each well (24 well plates)
and incubate at RT for 20 minutes. Be sure to distribute the solution over each well
(note: a 12-channel pipetter may be used with tips on every other channel). Aspirate off
the Poly-D-Lysine solution and rinse with 1 ml PBS (Phosphate Buffered Saline). The
PBS should remain in the well until just prior to plating the cells, and plates may be
poly-lysine coated in advance for up to two weeks.
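
The dilution arithmetic in the coating step can be sketched as follows. This is a minimal illustration, not part of the protocol: the function names are hypothetical, "1:20" is assumed to mean one part stock in twenty parts total volume, and no pipetting overage is included.

```python
def working_concentration(stock_ug_per_ml: float, dilution_factor: float) -> float:
    """Concentration after a 1:dilution_factor dilution, assuming one part
    stock in dilution_factor parts total volume."""
    return stock_ug_per_ml / dilution_factor


def coating_volumes_ml(n_plates: int, wells_per_plate: int = 24,
                       ul_per_well: float = 200.0,
                       dilution_factor: float = 20.0):
    """Return (total working solution, stock, PBS) in ml for coating
    n_plates plates. No overage is added; that is a simplification."""
    total_ml = n_plates * wells_per_plate * ul_per_well / 1000.0
    stock_ml = total_ml / dilution_factor
    return total_ml, stock_ml, total_ml - stock_ml


# 1 mg/ml stock expressed as 1000 ug/ml, diluted 1:20
print(working_concentration(1000.0, 20))
# Volumes for four 24-well plates at 200 ul per well
print(coating_volumes_ml(4))
```

In practice some excess working solution would normally be prepared to cover pipetting losses.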

Plate 293T cells (do not carry cells past P+20) at 2 x 10^5 cells/well in .5 ml
DMEM (Dulbecco's Modified Eagle Medium) (with 4.5 G/L glucose and L-glutamine
(12-604F Biowhittaker))/10% heat inactivated FBS (14-503F Biowhittaker)/1x
Penstrep (17-602E Biowhittaker). Let the cells grow overnight.

The next day, mix together in a sterile solution basin: 300 ul Lipofectamine
(18324-012 Gibco/BRL) and 5 ml Optimem I (31985070 Gibco/BRL)/96-well plate.
With a small volume multi-channel pipetter, aliquot approximately 2 ug of an expression
vector containing a polynucleotide insert, produced by the methods described in
Examples 8 or 9, into an appropriately labeled 96-well round bottom plate. With a
multi-channel pipetter, add 50 ul of the Lipofectamine/Optimem I mixture to each well.
Pipette up and down gently to mix. Incubate at RT 15-45 minutes. After about 20
minutes, use a multi-channel pipetter to add 150 ul Optimem I to each well. As a
control, one plate of vector DNA lacking an insert should be transfected with each set of
transfections.
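
The volumes in this step can be cross-checked with a short calculation. The sketch below is only an illustration under stated assumptions: the constant and function names are hypothetical, the small volume of the DNA itself is ignored, and the figures are those given in the paragraph above.

```python
# Volumes taken from the protocol text, all in microlitres (ul).
LIPOFECTAMINE_UL = 300       # per 96-well plate
OPTIMEM_UL = 5000            # 5 ml Optimem I per 96-well plate
MIX_PER_WELL_UL = 50         # Lipofectamine/Optimem I mixture added per well
OPTIMEM_TOPUP_UL = 150       # Optimem I added per well after about 20 minutes
WELLS_PER_PLATE = 96


def mix_budget(plates: int = 1):
    """Return (prepared, dispensed, surplus) mixture volumes in ul."""
    prepared = plates * (LIPOFECTAMINE_UL + OPTIMEM_UL)
    dispensed = plates * WELLS_PER_PLATE * MIX_PER_WELL_UL
    return prepared, dispensed, prepared - dispensed


def complex_volume_per_well() -> int:
    """Per-well volume after the top-up, ignoring the DNA volume."""
    return MIX_PER_WELL_UL + OPTIMEM_TOPUP_UL


print(mix_budget(1))
print(complex_volume_per_well())
```

Under these assumptions the per-well total agrees with the 200 ul of DNA/Lipofectamine/Optimem I complex that is transferred to the cells in the next step.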

Preferably, the transfection should be performed by tag-teaming the following
tasks. By tag-teaming, hands-on time is cut in half, and the cells do not spend too
much time on PBS. First, person A aspirates off the media from four 24-well plates of
cells, and then person B rinses each well with .5-1 ml PBS. Person A then aspirates off
PBS rinse, and person B, using a 12-channel pipetter with tips on every other channel,
adds the 200 ul of DNA/Lipofectamine/Optimem I complex to the odd wells first, then to
the even wells, to each row on the 24-well plates. Incubate at 37° C for 6 hours.

While cells are incubating, prepare appropriate media, either 1% BSA in DMEM
with 1x penstrep, or CHO-5 media (116.6 mg/L of CaCl2 (anhyd); 0.00130 mg/L
CuSO4-5H2O; 0.050 mg/L of Fe(NO3)3-9H2O; 0.417 mg/L of FeSO4-7H2O; 311.80
mg/L of KCl; 28.64 mg/L of MgCl2; 48.84 mg/L of MgSO4; 6995.50 mg/L of NaCl;
2400.0 mg/L of NaHCO3; 62.50 mg/L of NaH2PO4-H2O; 71.02 mg/L of Na2HPO4;
.4320 mg/L of ZnSO4-7H2O; .002 mg/L of Arachidonic Acid; 1.022 mg/L of
Cholesterol; .070 mg/L of DL-alpha-Tocopherol-Acetate; 0.0520 mg/L of Linoleic
Acid; 0.010 mg/L of Linolenic Acid; 0.010 mg/L of Myristic Acid; 0.010 mg/L of Oleic
Acid; 0.010 mg/L of Palmitric Acid; 0.010 mg/L of Palmitic Acid; 100 mg/L of
Pluronic F-68; 0.010 mg/L of Stearic Acid; 2.20 mg/L of Tween 80; 4551 mg/L of
D-Glucose; 130.85 mg/ml of L-Alanine; 147.50 mg/ml of L-Arginine-HCL; 7.50 mg/ml
of L-Asparagine-H2O; 6.65 mg/ml of L-Aspartic Acid; 29.56 mg/ml of
L-Cystine-2HCL-H2O; 31.29 mg/ml of L-Cystine-2HCL; 7.35 mg/ml of L-Glutamic Acid; 365.0
mg/ml of L-Glutamine; 18.75 mg/ml of Glycine; 52.48 mg/ml of
L-Histidine-HCL-H2O; 106.97 mg/ml of L-Isoleucine; 111.45 mg/ml of L-Leucine; 163.75 mg/ml of
L-Lysine HCL; 32.34 mg/ml of L-Methionine; 68.48 mg/ml of L-Phenylalanine; 40.0
mg/ml of L-Proline; 26.25 mg/ml of L-Serine; 101.05 mg/ml of L-Threonine; 19.22
mg/ml of L-Tryptophan; 91.79 mg/ml of L-Tyrosine-2Na-2H2O; 99.65 mg/ml of
L-Valine; 0.0035 mg/L of Biotin; 3.24 mg/L of D-Ca Pantothenate; 11.78 mg/L of
Choline Chloride; 4.65 mg/L of Folic Acid; 15.60 mg/L of i-Inositol; 3.02 mg/L of
Niacinamide; 3.00 mg/L of Pyridoxal HCL; 0.031 mg/L of Pyridoxine HCL; 0.319
mg/L of Riboflavin; 3.17 mg/L of Thiamine HCL; 0.365 mg/L of Thymidine; and
0.680 mg/L of Vitamin B12; 25 mM of HEPES Buffer; 2.39 mg/L of Na Hypoxanthine;
0.105 mg/L of Lipoic Acid; 0.081 mg/L of Sodium Putrescine-2HCL; 55.0 mg/L of
Sodium Pyruvate; 0.0067 mg/L of Sodium Selenite; 20 uM of Ethanolamine; 0.122
mg/L of Ferric Citrate; 41.70 mg/L of Methyl-B-Cyclodextrin complexed with Linoleic
Acid; 33.33 mg/L of Methyl-B-Cyclodextrin complexed with Oleic Acid; and 10 mg/L
of Methyl-B-Cyclodextrin complexed with Retinal) with 2 mM glutamine and 1x
penstrep. (BSA (81-068-3 Bayer) 100 gm dissolved in 1 L DMEM for a 10% BSA stock
solution). Filter the media and collect 50 ul for endotoxin assay in 15 ml polystyrene
conical.

The transfection reaction is terminated, preferably by tag-teaming, at the end of
the incubation period. Person A aspirates off the transfection media, while person B
adds 1.5 ml appropriate media to each well. Incubate at 37° C for 45 or 72 hours
depending on the media used: 1% BSA for 45 hours or CHO-5 for 72 hours.

On day four, using a 300 ul multichannel pipetter, aliquot 600 ul in one 1 ml deep
well plate and the remaining supernatant into a 2 ml deep well. The supernatants from
each well can then be used in the assays described in Examples 13-20.

It is specifically understood that when activity is obtained in any of the assays
described below using a supernatant, the activity originates from either the polypeptide
directly (e.g., as a secreted protein) or by the polypeptide inducing expression of other
proteins, which are then secreted into the supernatant. Thus, the invention further
provides a method of identifying the protein in the supernatant characterized by an
activity in a particular assay.

Example 12: Construction of GAS Reporter Construct

One signal transduction pathway involved in the differentiation and proliferation
of cells is called the Jaks-STATs pathway. Activated proteins in the Jaks-STATs
pathway bind to gamma activation site "GAS" elements or interferon-sensitive
responsive element ("ISRE"), located in the promoter of many genes. The binding of a
protein to these elements alters the expression of the associated gene.

GAS and ISRE elements are recognized by a class of transcription factors called
Signal Transducers and Activators of Transcription, or "STATs." There are six
members of the STATs family. Stat1 and Stat3 are present in many cell types, as is
Stat2 (as response to IFN-alpha is widespread). Stat4 is more restricted and is not in
many cell types though it has been found in T helper class I cells after treatment with
IL-12. Stat5 was originally called mammary growth factor, but has been found at
higher concentrations in other cells including myeloid cells. It can be activated in tissue
culture cells by many cytokines.

The STATs are activated to translocate from the cytoplasm to the nucleus upon
tyrosine phosphorylation by a set of kinases known as the Janus Kinase ("Jaks")
family. Jaks represent a distinct family of soluble tyrosine kinases and include Tyk2,
Jak1, Jak2, and Jak3. These kinases display significant sequence similarity and are
generally catalytically inactive in resting cells.

The Jaks are activated by a wide range of receptors summarized in the Table
below. (Adapted from review by Schindler and Darnell, Ann. Rev. Biochem. 64:621-51
(1995).) A cytokine receptor family, capable of activating Jaks, is divided into two
groups: (a) Class 1 includes receptors for IL-2, IL-3, IL-4, IL-6, IL-7, IL-9, IL-11,
IL-12, IL-15, Epo, PRL, GH, G-CSF, GM-CSF, LIF, CNTF, and thrombopoietin; and
(b) Class 2 includes IFN-a, IFN-g, and IL-10. The Class 1 receptors share a
conserved cysteine motif (a set of four conserved cysteines and one tryptophan) and a
WSXWS motif (a membrane proximal region encoding Trp-Ser-Xxx-Trp-Ser (SEQ ID
NO:2)).

Thus, on binding of a ligand to a receptor, Jaks are activated, which in turn
activate STATs, which then translocate and bind to GAS elements. This entire process
is encompassed in the Jaks-STATs signal transduction pathway.

Therefore, activation of the Jaks-STATs pathway, reflected by the binding of 
the GAS or the ISRE element, can be used to indicate proteins involved in the 
proliferation and differentiation of cells. For example, growth factors and cytokines are 
known to activate the Jaks-STATs pathway. (See Table below.) Thus, by using GAS
elements linked to reporter molecules, activators of the Jaks-STATs pathway can be 
identified.

                        JAKs
Ligand                  tyk2    Jak1    Jak2    Jak3    STATS    GAS (elements) or ISRE

IFN family
IFN-a/B                 +       +       -       -       1,2,3    ISRE
IFN-g                           +       +       -       1        GAS (IRF1>Lys6>IFP)
IL-10                   +       ?       ?       -       1,3

gp130 family
IL-6 (Pleiotropic)      +       +       +       ?       1,3      GAS (IRF1>Lys6>IFP)
IL-11 (Pleiotropic)     ?       +       ?       ?       1,3
OnM (Pleiotropic)       ?       +       +       ?       1,3
LIF (Pleiotropic)       ?       +       +       ?       1,3
CNTF (Pleiotropic)      -/+     +       +       ?       1,3
G-CSF (Pleiotropic)     ?       +       ?       ?       1,3
IL-12 (Pleiotropic)     +       -       +       +       1,3

g-C family
IL-2 (lymphocytes)      -       +       -       +       1,3,5    GAS
IL-4 (lymph/myeloid)    -       +       -       +       6        GAS (IRF1 = IFP >> Ly6)(IgH)
IL-7 (lymphocytes)      -       +       -       +       5        GAS
IL-9 (lymphocytes)      -       +       -       +       5        GAS
IL-13 (lymphocyte)      -       +       ?       ?       6        GAS
IL-15                   ?       +       ?       +       5        GAS

gp140 family
IL-3 (myeloid)          -       -       +       -       5        GAS (IRF1>IFP>>Ly6)
IL-5 (myeloid)          -       -       +       -       5        GAS
GM-CSF (myeloid)        -       -       +       -       5        GAS

Growth hormone family
GH                      ?       -       +       -       5
PRL                     ?       +/-     +       -       1,3,5
EPO                     ?       -       +       -       5        GAS (B-CAS>IRF1=IFP>>Ly6)

Receptor Tyrosine Kinases
EGF                     ?       +       +       -       1,3      GAS (IRF1)
PDGF                    ?       +       +       -       1,3
CSF-1                   ?       +       +       -       1,3      GAS (not IRF1)

To construct a synthetic GAS containing promoter element, which is used in the
Biological Assays described in Examples 13-14, a PCR based strategy is employed to
generate a GAS-SV40 promoter sequence. The 5' primer contains four tandem copies
of the GAS binding site found in the IRF1 promoter and previously demonstrated to
bind STATs upon induction with a range of cytokines (Rothman et al., Immunity
1:457-468 (1994)), although other GAS or ISRE elements can be used instead. The 5'
primer also contains 18bp of sequence complementary to the SV40 early promoter
sequence and is flanked with an XhoI site. The sequence of the 5' primer is:

5':GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG
AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG:3' (SEQ ID NO:3)

The downstream primer is complementary to the SV40 promoter and is flanked
with a Hind III site: 5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID
NO:4)
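
The stated features of the two primers can be verified by simple string searches. The following sketch is illustrative only: `GAS_CORE` is an assumed repeat unit read off the 5' primer itself rather than a definition given in the text, the helper name `count_motif` is hypothetical, and the XhoI (CTCGAG) and Hind III (AAGCTT) recognition sequences are the standard ones for those enzymes.

```python
# Primer sequences as given for SEQ ID NO:3 and SEQ ID NO:4.
FIVE_PRIME_PRIMER = (
    "GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG"
    "AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG"
)
DOWNSTREAM_PRIMER = "GCGGCAAGCTTTTTGCAAAGCCTAGGC"

GAS_CORE = "TTTCCCCGAAA"   # assumption: repeat unit inferred from the primer
XHOI_SITE = "CTCGAG"       # XhoI recognition sequence
HINDIII_SITE = "AAGCTT"    # Hind III recognition sequence


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count, start = 0, 0
    while True:
        idx = seq.find(motif, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1


print("GAS copies in 5' primer:", count_motif(FIVE_PRIME_PRIMER, GAS_CORE))
print("XhoI site offset:", FIVE_PRIME_PRIMER.find(XHOI_SITE))
print("Hind III site offset:", DOWNSTREAM_PRIMER.find(HINDIII_SITE))
```

The same search can be applied to a sequenced insert to confirm that all four copies survived amplification and subcloning.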

PCR amplification is performed using the SV40 promoter template present in
the B-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI/Hind III and subcloned into BLSK2-. (Stratagene.) Sequencing
with forward and reverse primers confirms that the insert contains the following
sequence:

5':CTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG
AAATGATTTCCCCGAAATATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCC
CTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCC
CCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCC
TCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTT
TGCAAAAAGCTT:3' (SEQ ID NO:5)

With this GAS promoter element linked to the SV40 promoter, a GAS:SEAP2
reporter construct is next engineered. Here, the reporter molecule is a secreted alkaline
phosphatase, or "SEAP." Clearly, however, any reporter molecule can be used instead of
SEAP, in this or in any of the other Examples. Well known reporter molecules that can
be used instead of SEAP include chloramphenicol acetyltransferase (CAT), luciferase,
alkaline phosphatase, B-galactosidase, green fluorescent protein (GFP), or any protein
detectable by an antibody.

The above sequence confirmed synthetic GAS-SV40 promoter element is
subcloned into the pSEAP-Promoter vector obtained from Clontech using HindIII and
XhoI, effectively replacing the SV40 promoter with the amplified GAS:SV40 promoter
element, to create the GAS-SEAP vector. However, this vector does not contain a
neomycin resistance gene, and therefore, is not preferred for mammalian expression
systems.

Thus, in order to generate mammalian stable cell lines expressing the GAS-SEAP
reporter, the GAS-SEAP cassette is removed from the GAS-SEAP vector using
SalI and NotI, and inserted into a backbone vector containing the neomycin resistance
gene, such as pGFP-1 (Clontech), using these restriction sites in the multiple cloning
site, to create the GAS-SEAP/Neo vector. Once this vector is transfected into
mammalian cells, this vector can then be used as a reporter molecule for GAS binding
as described in Examples 13-14.

Other constructs can be made using the above description and replacing GAS
with a different promoter sequence. For example, construction of reporter molecules
containing NF-KB and EGR promoter sequences is described in Examples 15 and 16.
However, many other promoters can be substituted using the protocols described in
these Examples. For instance, SRE, IL-2, NFAT, or Osteocalcin promoters can be
substituted, alone or in combination (e.g., GAS/NF-KB/EGR, GAS/NF-KB,
IL-2/NFAT, or NF-KB/GAS). Similarly, other cell lines can be used to test reporter
construct activity, such as HELA (epithelial), HUVEC (endothelial), Reh (B-cell),
Saos-2 (osteoblast), HUVAC (aortic), or Cardiomyocyte.

Example 13: High-Throughput Screening Assay for T-cell Activity

The following protocol is used to assess T-cell activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate T-cells.
T-cell activity is assessed using the GAS/SEAP/Neo construct produced in Example 12.
Thus, factors that increase SEAP activity indicate the ability to activate the Jaks-STATS
signal transduction pathway. The T-cells used in this assay are Jurkat T-cells (ATCC
Accession No. TIB-152), although Molt-3 cells (ATCC Accession No. CRL-1552) and
Molt-4 cells (ATCC Accession No. CRL-1582) can also be used.

Jurkat T-cells are lymphoblastic CD4+ Th1 helper cells. In order to generate
stable cell lines, approximately 2 million Jurkat cells are transfected with the
GAS-SEAP/neo vector using DMRIE-C (Life Technologies) (transfection procedure
described below). The transfected cells are seeded to a density of approximately
20,000 cells per well and transfectants resistant to 1 mg/ml genticin are selected. Resistant
colonies are expanded and then tested for their response to increasing concentrations of
interferon gamma. The dose response of a selected clone is demonstrated.

Specifically, the following protocol will yield sufficient cells for 75 wells
containing 200 ul of cells. Thus, it is either scaled up, or performed in multiple to
generate sufficient cells for multiple 96 well plates. Jurkat cells are maintained in RPMI
+ 10% serum with 1% Pen-Strep. Combine 2.5 ml of OPTI-MEM (Life Technologies)
with 10 ug of plasmid DNA in a T25 flask. Add 2.5 ml OPTI-MEM containing 50 ul
of DMRIE-C and incubate at room temperature for 15-45 mins.

During the incubation period, count cell concentration, spin down the required
number of cells (10^ per transfection), and resuspend in OPTI-MEM to a final
concentration of 10^ cells/ml. Then add 1 ml of 1 x 10^ cells in OPTI-MEM to the T25
flask and incubate at 37°C for 6 hrs. After the incubation, add 10 ml of RPMI + 15%
serum.

The Jurkat:GAS-SEAP stable reporter lines are maintained in RPMI + 10%
serum, 1 mg/ml Geneticin, and 1% Pen-Strep. These cells are treated with supernatants
containing a polypeptide as produced by the protocol described in Example 11.

On the day of treatment with the supernatant, the cells should be washed and
resuspended in fresh RPMI + 10% serum to a density of 500,000 cells per ml. The
exact number of cells required will depend on the number of supernatants being
screened. For one 96 well plate, approximately 10 million cells (for 10 plates, 100
million cells) are required.

Transfer the cells to a triangular reservoir boat, in order to dispense the cells into
a 96 well dish, using a 12 channel pipette. Using a 12 channel pipette, transfer 200 ul
of cells into each well (therefore adding 100,000 cells per well).
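
The seeding figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the constants are the density (500,000 cells per ml), well volume (200 ul), and plate size (96 wells) stated in the protocol.

```python
# Seeding arithmetic for the Jurkat assay, using values stated in the protocol.
CELLS_PER_ML = 500_000   # resuspension density
WELL_VOLUME_UL = 200     # volume dispensed per well
WELLS_PER_PLATE = 96


def cells_per_well(cells_per_ml: int, volume_ul: int) -> int:
    """Number of cells delivered to a single well."""
    return cells_per_ml * volume_ul // 1000


def cells_needed(plates: int) -> int:
    """Cells that actually end up in the wells of the given number of plates."""
    return plates * WELLS_PER_PLATE * cells_per_well(CELLS_PER_ML, WELL_VOLUME_UL)


if __name__ == "__main__":
    print(cells_per_well(CELLS_PER_ML, WELL_VOLUME_UL))  # 100000
    print(cells_needed(1))                               # 9600000
    print(cells_needed(10))                              # 96000000
```

The computed totals (9.6 million and 96 million cells) sit just under the protocol's rounded figures of 10 million and 100 million, which leave a small margin for pipetting losses.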

After all the plates have been seeded, 50 ul of the supernatants are transferred
directly from the 96 well plate containing the supernatants into each well using a 12
channel pipette. In addition, a dose of exogenous interferon gamma (0.1, 1.0, 10 ng)
is added to wells H9, H10, and H11 to serve as additional positive controls for the
assay.

The 96 well dishes containing Jurkat cells treated with supernatants are placed in
an incubator for 48 hrs (note: this time is variable between 48-72 hrs). 35 ul samples
from each well are then transferred to an opaque 96 well plate using a 12 channel
pipette. The opaque plates should be covered (using cellophane covers) and stored at
-20°C until SEAP assays are performed according to Example 17. The plates
containing the remaining treated cells are placed at 4°C and serve as a source of
material for repeating the assay on a specific well if desired.

As a positive control, 100 Unit/ml interferon gamma can be used which is
known to activate Jurkat T cells. Over 30 fold induction is typically observed in the
positive control wells.

Example 14: High-Throughput Screening Assay Identifying Myeloid Activity

The following protocol is used to assess myeloid activity by identifying factors,
such as growth factors and cytokines, that may induce myeloid cells to proliferate or
differentiate. Myeloid cell activity is assessed using the GAS/SEAP/Neo construct
produced in Example 12. Thus, factors that increase SEAP activity indicate the ability
to activate the Jaks-STATS signal transduction pathway. The myeloid cell used in this
assay is U937, a pre-monocyte cell line, although TF-1, HL60, or KG1 can be used.

To transiently transfect U937 cells with the GAS/SEAP/Neo construct produced
in Example 12, a DEAE-Dextran method (Kharbanda et al., 1994, Cell Growth &
Differentiation, 5:259-265) is used. First, harvest 2x10^7 U937 cells and wash with
PBS. The U937 cells are usually grown in RPMI 1640 medium containing 10%
heat-inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin
and 100 ug/ml streptomycin.

Next, suspend the cells in 1 ml of 20 mM Tris-HCl (pH 7.4) buffer containing
0.5 mg/ml DEAE-Dextran, 8 ug GAS-SEAP2 plasmid DNA, 140 mM NaCl, 5 mM
KCl, 375 uM Na2HPO4.7H2O, 1 mM MgCl2, and 675 uM CaCl2. Incubate at 37°C
for 45 min.

Wash the cells with RPMI 1640 medium containing 10% FBS and then
resuspend in 10 ml complete medium and incubate at 37°C for 36 hr.

The GAS-SEAP/U937 stable cells are obtained by growing the cells in 400
ug/ml G418. The G418-free medium is used for routine growth but every one to two
months, the cells should be re-grown in 400 ug/ml G418 for a couple of passages.

These cells are tested by harvesting 1x10^8 cells (this is enough for ten 96-well
plate assays) and washing with PBS. Suspend the cells in 200 ml of the above described
growth medium, with a final density of 5x10^5 cells/ml. Plate 200 ul cells per well in
the 96-well plate (or 1x10^5 cells/well).
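
As a consistency check on the quantities in this step, the sketch below derives the suspension volume and the number of full plates from a cell count and a target density. It assumes the cell count reads as 1x10^8 and the density as 5x10^5 cells/ml (the exponents are damaged in the source text); the function name is hypothetical.

```python
def suspension_plan(total_cells: int, density_per_ml: int,
                    well_volume_ul: int, wells_per_plate: int = 96):
    """Return (suspension volume in ml, wells that can be filled, full plates)."""
    volume_ml = total_cells / density_per_ml
    wells = int(volume_ml * 1000 // well_volume_ul)
    return volume_ml, wells, wells // wells_per_plate


if __name__ == "__main__":
    # Assumed readings of the source: 1x10^8 cells, 5x10^5 cells/ml, 200 ul per well.
    volume_ml, wells, plates = suspension_plan(100_000_000, 500_000, 200)
    print(volume_ml, wells, plates)  # 200.0 1000 10
```

Under these assumed values the numbers agree with the text: 200 ml of suspension fills 1000 wells, which covers ten 96-well plates.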

Add 50 ul of the supernatant prepared by the protocol described in Example 11.
Incubate at 37°C for 48 to 72 hr. As a positive control, 100 Unit/ml interferon gamma
can be used which is known to activate U937 cells. Over 30 fold induction is typically
observed in the positive control wells. SEAP assay the supernatant according to the
protocol described in Example 17.

Example 15: High-Throughput Screening Assay Identifying Neuronal Activity

When cells undergo differentiation and proliferation, a group of genes is
activated through many different signal transduction pathways. One of these genes,
EGR1 (early growth response gene 1), is induced in various tissues and cell types upon
activation. The promoter of EGR1 is responsible for such induction. Using the EGR1
promoter linked to reporter molecules, activation of cells can be assessed.

Particularly, the following protocol is used to assess neuronal activity in PC12
cell lines. PC12 cells (rat pheochromocytoma cells) are known to proliferate and/or
differentiate by activation with a number of mitogens, such as TPA (tetradecanoyl
phorbol acetate), NGF (nerve growth factor), and EGF (epidermal growth factor). The
EGR1 gene expression is activated during this treatment. Thus, by stably transfecting
PC12 cells with a construct containing an EGR promoter linked to SEAP reporter,
activation of PC12 cells can be assessed.

The EGR/SEAP reporter construct can be assembled by the following protocol.
The EGR-1 promoter sequence (-633 to +1) (Sakamoto K et al., Oncogene 6:867-871
(1991)) can be PCR amplified from human genomic DNA using the following primers:

5' GCGCTCGAGGGATGACAGCGATAGAACCCCGG -3' (SEQ ID NO:6)

5' GCGAAGCTTCGCGACTCCCCGGATCCGCCTC-3' (SEQ ID NO:7)

Using the GAS:SEAP/Neo vector produced in Example 12, EGR1 amplified
product can then be inserted into this vector. Linearize the GAS:SEAP/Neo vector
using restriction enzymes XhoI/HindIII, removing the GAS/SV40 stuffer. Restrict the
EGR1 amplified product with these same enzymes. Ligate the vector and the EGR1
promoter.
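
The cloning step relies on each primer carrying one of the two restriction sites near its 5' end. The short sketch below locates the standard XhoI (CTCGAG) and HindIII (AAGCTT) recognition sequences in the two primers given above. It is an illustrative check only; the dictionary and function names are hypothetical.

```python
# Recognition sequences for the two enzymes used in the cloning step.
SITES = {"XhoI": "CTCGAG", "HindIII": "AAGCTT"}

# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "SEQ ID NO:6": "GCGCTCGAGGGATGACAGCGATAGAACCCCGG",
    "SEQ ID NO:7": "GCGAAGCTTCGCGACTCCCCGGATCCGCCTC",
}


def find_sites(seq: str) -> dict:
    """Map enzyme name to the 0-based position of its site, for sites present in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

In both primers the site begins after a three-base 5' clamp (GCG), so digestion with XhoI and HindIII gives the amplified promoter directional ends matching the linearized vector.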

To prepare 96 well-plates for cell culture, two mls of a coating solution (1:30
dilution of collagen type I (Upstate Biotech Inc. Cat#08-115) in 30% ethanol (filter
sterilized)) is added per one 10 cm plate or 50 ul per well of the 96-well plate, and
allowed to air dry for 2 hr.

PC12 cells are routinely grown in RPMI-1640 medium (Bio Whittaker)
containing 10% horse serum (JRH BIOSCIENCES, Cat. # 12449-78P), 5%
heat-inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin
and 100 ug/ml streptomycin on a precoated 10 cm tissue culture dish. One to four split
is done every three to four days. Cells are removed from the plates by scraping and
resuspended by pipetting up and down more than 15 times.

Transfect the EGR/SEAP/Neo construct into PC12 using the Lipofectamine
protocol described in Example 11. EGR-SEAP/PC12 stable cells are obtained by
growing the cells in 300 ug/ml G418. The G418-free medium is used for routine
growth but every one to two months, the cells should be re-grown in 300 ug/ml G418
for a couple of passages.

To assay for neuronal activity, a 10 cm plate with cells around 70 to 80%
confluent is screened by removing the old medium. Wash the cells once with PBS
(Phosphate buffered saline). Then starve the cells in low serum medium (RPMI-1640
containing 1% horse serum and 0.5% FBS with antibiotics) overnight.

The next morning, remove the medium and wash the cells with PBS. Scrape
off the cells from the plate and suspend the cells well in 2 ml low serum medium. Count
the cell number and add more low serum medium to reach a final cell density of 5x10^5
cells/ml.

Add 200 ul of the cell suspension to each well of 96-well plate (equivalent to
1x10^5 cells/well). Add 50 ul supernatant produced by Example 11 and incubate at 37°C
for 48 to 72 hr. As a positive control, a growth factor known to activate PC12 cells
through EGR can be used, such as 50 ng/ul of Neuronal Growth Factor (NGF). Over
fifty-fold induction of SEAP is typically seen in the positive control wells. SEAP
assay the supernatant according to Example 17.

Example 16: High-Throughput Screening Assay for T-cell Activity

NF-kB (Nuclear Factor kB) is a transcription factor activated by a wide variety
of agents including the inflammatory cytokines IL-1 and TNF, CD30 and CD40,
lymphotoxin-alpha and lymphotoxin-beta, by exposure to LPS or thrombin, and by
expression of certain viral gene products. As a transcription factor, NF-kB regulates
the expression of genes involved in immune cell activation, control of apoptosis
(NF-kB appears to shield cells from apoptosis), B and T-cell development, anti-viral
and antimicrobial responses, and multiple stress responses.

In non-stimulated conditions, NF-kB is retained in the cytoplasm with I-kB
(Inhibitor kB). However, upon stimulation, I-kB is phosphorylated and degraded,
causing NF-kB to shuttle to the nucleus, thereby activating transcription of target
genes. Target genes activated by NF-kB include IL-2, IL-6, GM-CSF, ICAM-1 and
class 1 MHC.

Due to its central role and ability to respond to a range of stimuli, reporter
constructs utilizing the NF-kB promoter element are used to screen the supernatants
produced in Example 11. Activators or inhibitors of NF-kB would be useful in treating
diseases. For example, inhibitors of NF-kB could be used to treat those diseases
related to the acute or chronic activation of NF-kB, such as rheumatoid arthritis.

To construct a vector containing the NF-kB promoter element, a PCR based
strategy is employed. The upstream primer contains four tandem copies of the NF-kB
binding site (GGGGACTTTCCC) (SEQ ID NO:8), 18 bp of sequence complementary
to the 5' end of the SV40 early promoter sequence, and is flanked with an XhoI site:

5':GCGGCCTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCCATCCTGCCATCTCAATTAG:3'
(SEQ ID NO:9)

The downstream primer is complementary to the 3' end of the SV40 promoter
and is flanked with a Hind III site:

5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID NO:4)

PCR amplification is performed using the SV40 promoter template present in
the pB-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI and Hind III and subcloned into BLSK2-. (Stratagene)
Sequencing with the T7 and T3 primers confirms the insert contains the following
sequence:

5':CTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCC
ATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCA
TCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACT
AATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC
CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTT:3'
(SEQ ID NO:10)

Next, replace the SV40 minimal promoter element present in the
pSEAP2-promoter plasmid (Clontech) with this NF-KB/SV40 fragment using XhoI and
HindIII. However, this vector does not contain a neomycin resistance gene, and
therefore, is not preferred for mammalian expression systems.

In order to generate stable mammalian cell lines, the NF-KB/SV40/SEAP
cassette is removed from the above NF-kB/SEAP vector using restriction enzymes SalI
and NotI, and inserted into a vector containing neomycin resistance. Particularly, the
NF-KB/SV40/SEAP cassette was inserted into pGFP-1 (Clontech), replacing the GFP
gene, after restricting pGFP-1 with SalI and NotI.

Once the NF-KB/SV40/SEAP/Neo vector is created, stable Jurkat T-cells are
created and maintained according to the protocol described in Example 13. Similarly,
the method for assaying supernatants with these stable Jurkat T-cells is also described
in Example 13. As a positive control, exogenous TNF alpha (0.1, 1, 10 ng) is added to
wells H9, H10, and H11, with a 5-10 fold activation typically observed.

Example 17: Assay for SEAP Activity 

As a reporter molecule for the assays described in Examples 13-16, SEAP
activity is assayed using the Tropix Phospho-light Kit (Cat. BP-400) according to the
following general procedure. The Tropix Phospho-light Kit supplies the Dilution,
Assay, and Reaction Buffers used below.

Prime a dispenser with the 2.5x Dilution Buffer and dispense 15 ul of 2.5x
dilution buffer into Optiplates containing 35 ul of a supernatant. Seal the plates with
a plastic sealer and incubate at 65°C for 30 min. Separate the Optiplates to avoid
uneven heating.

Cool the samples to room temperature for 15 minutes. Empty the dispenser and
prime with the Assay Buffer. Add 50 ul Assay Buffer and incubate at room
temperature for 5 min. Empty the dispenser and prime with the Reaction Buffer (see the
table below). Add 50 ul Reaction Buffer and incubate at room temperature for 20
minutes. Since the intensity of the chemiluminescent signal is time dependent, and it
takes about 10 minutes to read 5 plates on the luminometer, one should treat 5 plates
at a time and start the second set 10 minutes later.

Read the relative light units in the luminometer. Set H12 as blank, and print the
results. An increase in chemiluminescence indicates reporter activity.

Reaction Buffer Formulation:

# of plates    Rxn buffer diluent (ml)    CSPD (ml)
10             60                         3
11             65                         3.25
12             70                         3.5
13             75                         3.75
14             80                         4
15             85                         4.25
16             90                         4.5
17             95                         4.75
18             100                        5
19             105                        5.25
20             110                        5.5
21             115                        5.75
22             120                        6
23             125                        6.25
24             130                        6.5
25             135                        6.75
26             140                        7
27             145                        7.25
28             150                        7.5
29             155                        7.75
30             160                        8
31             165                        8.25
32             170                        8.5
33             175                        8.75
34             180                        9
35             185                        9.25
36             190                        9.5
37             195                        9.75
38             200                        10
39             205                        10.25
40             210                        10.5
41             215                        10.75
42             220                        11
43             225                        11.25
44             230                        11.5
45             235                        11.75
46             240                        12
47             245                        12.25
48             250                        12.5
49             255                        12.75
50             260                        13
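
The tabulated volumes follow a simple linear pattern: the diluent volume is 10 ml plus 5 ml per plate, and the CSPD volume is one twentieth of the diluent volume. The sketch below encodes that pattern. The rule is inferred from the table rather than stated in the text, and the function name is hypothetical, so it is restricted to the tabulated range of 10 to 50 plates.

```python
def reaction_buffer(plates: int) -> tuple[float, float]:
    """Return (diluent ml, CSPD ml) for a plate count within the tabulated range.

    The linear rule is inferred from the Reaction Buffer Formulation table;
    it is not a formula given by the kit or the protocol.
    """
    if not 10 <= plates <= 50:
        raise ValueError("the table only covers 10 to 50 plates")
    diluent_ml = 10 + 5 * plates
    cspd_ml = diluent_ml / 20
    return diluent_ml, cspd_ml


if __name__ == "__main__":
    for n in (10, 13, 50):
        print(n, reaction_buffer(n))
```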



Example 18: High-Throughput Screening Assay Identifying Changes in
Small Molecule Concentration and Membrane Permeability

Binding of a ligand to a receptor is known to alter intracellular levels of small 
molecules, such as calcium, potassium, sodium, and pH, as well as alter membrane 
potential. These alterations can be measured in an assay to identify supernatants which
bind to receptors of a particular cell. Although the following protocol describes an 
assay for calcium, this protocol can easily be modified to detect changes in potassium, 
sodium, pH, membrane potential, or any other small molecule which is detectable by a 
fluorescent probe. 

The following assay uses Fluorometric Imaging Plate Reader ("FLIPR") to 
measure changes in fluorescent molecules (Molecular Probes) that bind small 
molecules. Clearly, any fluorescent molecule detecting a small molecule can be used 
instead of the calcium fluorescent molecule, fluo-3, used here. 

For adherent cells, seed the cells at 10,000-20,000 cells/well in a Co-star black
96-well plate with clear bottom. The plate is incubated in a CO2 incubator for 20 hours. 
The adherent cells are washed two times in Biotek washer with 200 ul of HBSS 
(Hank's Balanced Salt Solution) leaving 100 ul of buffer after the final wash. 

A stock solution of 1 mg/ml fluo-3 is made in 10% pluronic acid DMSO. To
load the cells with fluo-3, 50 ul of 12 ug/ml fluo-3 is added to each well. The plate is
incubated at 37°C in a CO2 incubator for 60 min. The plate is washed four times in the
Biotek washer with HBSS leaving 100 ul of buffer.

For non-adherent cells, the cells are spun down from culture media. Cells are
re-suspended to 2-5x10^ cells/ml with HBSS in a 50-ml conical tube. 4 ul of 1 mg/ml
fluo-3 solution in 10% pluronic acid DMSO is added to each ml of cell suspension.
The tube is then placed in a 37°C water bath for 30-60 min. The cells are washed twice
with HBSS, resuspended to 1x10^ cells/ml, and dispensed into a microplate, 100
ul/well. The plate is centrifuged at 1000 rpm for 5 min. The plate is then washed once
in Denley CellWash with 200 ul, followed by an aspiration step to 100 ul final volume.

For a non-cell based assay, each well contains a fluorescent molecule, such as 
fluo-3. The supernatant is added to the well, and a change in fluorescence is detected. 

To measure the fluorescence of intracellular calcium, the FLIPR is set for the
following parameters: (1) System gain is 300-800 mW; (2) Exposure time is 0.4
second; (3) Camera F/stop is F/2; (4) Excitation is 488 nm; (5) Emission is 530 nm; and
(6) Sample addition is 50 ul. Increased emission at 530 nm indicates an extracellular
signaling event which has resulted in an increase in the intracellular Ca++
concentration.


Example 19: High-Throughput Screening Assay Identifying Tyrosine
Kinase Activity

The Protein Tyrosine Kinases (PTK) represent a diverse group of
transmembrane and cytoplasmic kinases. Within the Receptor Protein Tyrosine Kinase
(RPTK) group are receptors for a range of mitogenic and metabolic growth factors
including the PDGF, FGF, EGF, NGF, HGF and Insulin receptor subfamilies. In
addition there is a large family of RPTKs for which the corresponding ligand is
unknown. Ligands for RPTKs include mainly secreted small proteins, but also
membrane-bound and extracellular matrix proteins.

Activation of RPTK by ligands involves ligand-mediated receptor dimerization,
resulting in transphosphorylation of the receptor subunits and activation of the
cytoplasmic tyrosine kinases. The cytoplasmic tyrosine kinases include receptor
associated tyrosine kinases of the src-family (e.g., src, yes, lck, lyn, fyn) and
non-receptor linked and cytosolic protein tyrosine kinases, such as the Jak family,
members of which mediate signal transduction triggered by the cytokine superfamily of
receptors (e.g., the Interleukins, Interferons, GM-CSF, and Leptin).

Because of the wide range of known factors capable of stimulating tyrosine
kinase activity, the identification of novel human secreted proteins capable of
activating tyrosine kinase signal transduction pathways is of interest. Therefore, the
following protocol is designed to identify those novel human secreted proteins capable
of activating the tyrosine kinase signal transduction pathways.

Seed target cells (e.g., primary keratinocytes) at a density of approximately
25,000 cells per well in 96 well Loprodyne Silent Screen Plates purchased from
Nalge Nunc (Naperville, IL). The plates are sterilized with two 30 minute rinses with
100% ethanol, rinsed with water and dried overnight. Some plates are coated for 2 hr
with 100 ul of cell culture grade type I collagen (50 mg/ml), gelatin (2%) or polylysine
(50 mg/ml), all of which can be purchased from Sigma Chemicals (St. Louis, MO), or
10% Matrigel purchased from Becton Dickinson (Bedford, MA), or calf serum, rinsed
with PBS and stored at 4°C. Cell growth on these plates is assayed by seeding 5,000
cells/well in growth medium and indirect quantitation of cell number through use of
alamarBlue as described by the manufacturer Alamar Biosciences, Inc. (Sacramento,
CA) after 48 hr. Falcon plate covers #3071 from Becton Dickinson (Bedford, MA) are
used to cover the Loprodyne Silent Screen Plates. Falcon Microtest III cell culture
plates can also be used in some proliferation experiments.

To prepare extracts, A431 cells are seeded onto the nylon membranes of
Loprodyne plates (20,000/200 ul/well) and cultured overnight in complete medium.
Cells are quiesced by incubation in serum-free basal medium for 24 hr. After 5-20
minutes treatment with EGF (60 ng/ml) or 50 ul of the supernatant produced in Example
11, the medium is removed and 100 ul of extraction buffer (20 mM HEPES pH
7.5, 0.15 M NaCl, 1% Triton X-100, 0.1% SDS, 2 mM Na3VO4, 2 mM Na4P2O7
and a cocktail of protease inhibitors (# 1836170) obtained from Boehringer Mannheim
(Indianapolis, IN)) is added to each well and the plate is shaken on a rotating shaker
for 5 minutes at 4°C. The plate is then placed in a vacuum transfer manifold and the
extract filtered through the 0.45 um membrane bottoms of each well using house vacuum.
Extracts are collected in a 96-well catch/assay plate in the bottom of the vacuum
manifold and immediately placed on ice. To obtain extracts clarified by centrifugation,
the content of each well, after detergent solubilization for 5 minutes, is removed and
centrifuged for 15 minutes at 4°C at 16,000 x g.

Test the filtered extracts for levels of tyrosine kinase activity. Although many 
methods of detecting tyrosine kinase activity are known, one method is described here. 

Generally, the tyrosine kinase activity of a supernatant is evaluated by
determining its ability to phosphorylate a tyrosine residue on a specific substrate (a
biotinylated peptide). Biotinylated peptides that can be used for this purpose include
PSK1 (corresponding to amino acids 6-20 of the cell division kinase cdc2-p34) and
PSK2 (corresponding to amino acids 1-17 of gastrin). Both peptides are substrates for
a range of tyrosine kinases and are available from Boehringer Mannheim.

The tyrosine kinase reaction is set up by adding the following components in
order. First, add 10 ul of 5 uM Biotinylated Peptide, then 10 ul ATP/Mg2+ (5 mM
ATP/50 mM MgCl2), then 10 ul of 5x Assay Buffer (40 mM imidazole hydrochloride,
pH 7.3, 40 mM beta-glycerophosphate, 1 mM EGTA, 100 mM MgCl2, 5 mM MnCl2,
0.5 mg/ml BSA), then 5 ul of Sodium Vanadate (1 mM), and then 5 ul of water. Mix the
components gently and preincubate the reaction mix at 30°C for 2 min. Initiate the
reaction by adding 10 ul of the control enzyme or the filtered supernatant.
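
The component volumes listed above add up to a 50 ul reaction, which is why the assay buffer is supplied at 5x. The sketch below tallies the volumes and computes the final concentration of each component that has a stated stock concentration. It is illustrative only; the names are hypothetical, and the 5 mM ATP figure is the ATP part of the ATP/Mg2+ stock.

```python
# (name, volume in ul, stock concentration or None, unit) as listed in the protocol.
COMPONENTS = [
    ("biotinylated peptide", 10, 5.0, "uM"),
    ("ATP", 10, 5.0, "mM"),
    ("assay buffer", 10, 5.0, "x"),
    ("sodium vanadate", 5, 1.0, "mM"),
    ("water", 5, None, ""),
    ("enzyme or filtered supernatant", 10, None, ""),
]


def total_volume_ul(components) -> int:
    """Sum of all component volumes in microliters."""
    return sum(volume for _, volume, _, _ in components)


def final_concentrations(components) -> dict:
    """Final concentration of each component with a known stock concentration."""
    total = total_volume_ul(components)
    return {
        name: (stock * volume / total, unit)
        for name, volume, stock, unit in components
        if stock is not None
    }


if __name__ == "__main__":
    print(total_volume_ul(COMPONENTS))  # 50
    for name, (conc, unit) in final_concentrations(COMPONENTS).items():
        print(f"{name}: {conc:g} {unit}")
```

With these volumes the peptide ends up at 1 uM, ATP at 1 mM, the assay buffer at 1x, and vanadate at 0.1 mM in the 50 ul reaction.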

The tyrosine kinase assay reaction is then terminated by adding 10 ul of 120 mM
EDTA and placing the reactions on ice.

Tyrosine kinase activity is determined by transferring a 50 ul aliquot of the reaction
mixture to a microtiter plate (MTP) module and incubating at 37°C for 20 min. This
allows the biotinylated peptide to bind to the streptavidin coated 96 well plate.
Wash the MTP module with 300 ul/well of PBS four times. Next add 75 ul of
anti-phosphotyrosine antibody conjugated to horseradish peroxidase (anti-P-Tyr-POD
(0.5 u/ml)) to each well and incubate at 37°C for one hour. Wash the well as
above.

Next add 100 ul of peroxidase substrate solution (Boehringer Mannheim) and
incubate at room temperature for at least 5 mins (up to 30 min). Measure the
absorbance of the sample at 405 nm by using an ELISA reader. The level of bound
peroxidase activity is quantitated using an ELISA reader and reflects the level of
tyrosine kinase activity.

Example 20: High-Throughput Screening Assay Identifying
Phosphorylation Activity

As a potential alternative and/or complement to the assay of protein tyrosine
kinase activity described in Example 19, an assay which detects activation
(phosphorylation) of major intracellular signal transduction intermediates can also be
used. For example, as described below one particular assay can detect tyrosine
phosphorylation of the Erk-1 and Erk-2 kinases. However, phosphorylation of other
molecules, such as Raf, JNK, p38 MAP, Map kinase kinase (MEK), MEK kinase,
Src, Muscle specific kinase (MuSK), IRAK, Tec, and Janus, as well as any other
phosphoserine, phosphotyrosine, or phosphothreonine molecule, can be detected by
substituting these molecules for Erk-1 or Erk-2 in the following assay.

Specifically, assay plates are made by coating the wells of a 96-well ELISA
plate with 0.1 ml of protein G (1 ug/ml) for 2 hr at room temp (RT). The plates are then
rinsed with PBS and blocked with 3% BSA/PBS for 1 hr at RT. The protein G plates
are then treated with 2 commercial monoclonal antibodies (100 ng/well) against Erk-1
and Erk-2 (1 hr at RT) (Santa Cruz Biotechnology). (To detect other molecules, this
step can easily be modified by substituting a monoclonal antibody detecting any of the
above described molecules.) After 3-5 rinses with PBS, the plates are stored at 4°C
until use.

A431 cells are seeded at 20,000/well in a 96-well Loprodyne filterplate and
cultured overnight in growth medium. The cells are then starved for 48 hr in basal
medium (DMEM) and then treated with EGF (6 ng/well) or 50 ul of the supernatants
obtained in Example 11 for 5-20 minutes. The cells are then solubilized and extracts
filtered directly into the assay plate.

After incubation with the extract for 1 hr at RT, the wells are again rinsed. As a
positive control, a commercial preparation of MAP kinase (10 ng/well) is used in place
of A431 extract. Plates are then treated with a commercial polyclonal (rabbit) antibody
(1 ug/ml) which specifically recognizes the phosphorylated epitope of the Erk-1 and
Erk-2 kinases (1 hr at RT). This antibody is biotinylated by standard procedures. The
bound polyclonal antibody is then quantitated by successive incubations with
Europium-streptavidin and Europium fluorescence enhancing reagent in the Wallac
DELFIA instrument (time-resolved fluorescence). An increased fluorescent signal over
background indicates phosphorylation.


Example 21: Method of Determining Alterations in a Gene
Corresponding to a Polynucleotide

RNA is isolated from entire families or individual patients presenting with a
phenotype of interest (such as a disease). cDNA is then generated from
these RNA samples using protocols known in the art. (See, Sambrook.) The cDNA is
then used as a template for PCR, employing primers surrounding regions of interest in
SEQ ID NO:X. Suggested PCR conditions consist of 35 cycles at 95°C for 30
seconds; 60-120 seconds at 52-58°C; and 60-120 seconds at 70°C, using buffer
solutions described in Sidransky, D., et al., Science 252:706 (1991).

PCR products are then sequenced using primers labeled at their 5' end with T4
polynucleotide kinase, employing SequiTherm Polymerase (Epicentre Technologies).
The intron-exon borders of selected exons are also determined and genomic PCR
products analyzed to confirm the results. PCR products harboring suspected mutations
are then cloned and sequenced to validate the results of the direct sequencing.

PCR products are cloned into T-tailed vectors as described in Holton, T.A. and
Graham, M.W., Nucleic Acids Research, 19:1156 (1991) and sequenced with T7
polymerase (United States Biochemical). Affected individuals are identified by
mutations not present in unaffected individuals.

Genomic rearrangements are also observed as a method of determining
alterations in a gene corresponding to a polynucleotide. Genomic clones isolated
according to Example 2 are nick-translated with digoxigenin-deoxyuridine
5'-triphosphate (Boehringer Mannheim), and FISH performed as described in Johnson,
Cg. et al., Methods Cell Biol. 35:73-99 (1991). Hybridization with the labeled probe is
carried out using a vast excess of human cot-1 DNA for specific hybridization to the
corresponding genomic locus.

Chromosomes are counterstained with 4,6-diamidino-2-phenylindole and
propidium iodide, producing a combination of C- and R-bands. Aligned images for
precise mapping are obtained using a triple-band filter set (Chroma Technology,
Brattleboro, VT) in combination with a cooled charge-coupled device camera
(Photometrics, Tucson, AZ) and variable excitation wavelength filters. (Johnson, Cv.
et al., Genet. Anal. Tech. Appl., 8:75 (1991).) Image collection, analysis and
chromosomal fractional length measurements are performed using the ISee Graphical
Program System. (Inovision Corporation, Durham, NC.) Chromosome alterations of
the genomic region hybridized by the probe are identified as insertions, deletions, and
translocations. These alterations are used as a diagnostic marker for an associated
disease.

Example 22: Method of Detecting Abnormal Levels of a Polypeptide in a 
Biological Sample 

A polypeptide of the present invention can be detected in a biological sample,
and if an increased or decreased level of the polypeptide is detected, this polypeptide
is a marker for a particular phenotype. Methods of detection are numerous, and thus, it
is understood that one skilled in the art can modify the following assay to fit their
particular needs.

For example, antibody-sandwich ELISAs are used to detect polypeptides in a
sample, preferably a biological sample. Wells of a microtiter plate are coated with
specific antibodies, at a final concentration of 0.2 to 10 ug/ml. The antibodies are
either monoclonal or polyclonal and are produced by the method described in Example 10.
The wells are blocked so that non-specific binding of the polypeptide to the well is
reduced.

The coated wells are then incubated for > 2 hours at RT with a sample
containing the polypeptide. Preferably, serial dilutions of the sample should be used to
validate results. The plates are then washed three times with deionized or distilled
water to remove unbound polypeptide.

Next, 50 ul of specific antibody-alkaline phosphatase conjugate, at a
concentration of 25-400 ng, is added and incubated for 2 hours at room temperature.
The plates are again washed three times with deionized or distilled water to remove
unbound conjugate.

Add 75 ul of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl
phosphate (NPP) substrate solution to each well and incubate 1 hour at room
temperature. Measure the reaction by a microtiter plate reader. Prepare a standard
curve, using serial dilutions of a control sample, and plot polypeptide concentration on
the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale).
Interpolate the concentration of the polypeptide in the sample using the standard curve.
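
One simple way to carry out this interpolation is piecewise-linear interpolation between neighboring standards, taken on the same axes as the plot (log concentration against linear signal). The sketch below is only one possible approach, not a method prescribed by the text. The function name is hypothetical, the standard-curve values are made up for illustration, and signals are assumed to rise strictly with concentration.

```python
import math


def interpolate_concentration(standards, signal):
    """Estimate concentration from a signal by linear interpolation in log10(conc).

    standards: iterable of (concentration, signal) pairs with strictly
    increasing signal. Raises ValueError outside the standard curve.
    """
    points = sorted((s, math.log10(c)) for c, s in standards)
    for (s0, x0), (s1, x1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            fraction = (signal - s0) / (s1 - s0)
            return 10 ** (x0 + fraction * (x1 - x0))
    raise ValueError("signal is outside the range of the standard curve")


if __name__ == "__main__":
    # Hypothetical standards: (concentration in ng/ml, absorbance).
    standards = [(1, 0.1), (10, 0.5), (100, 0.9)]
    print(interpolate_concentration(standards, 0.5))
    print(round(interpolate_concentration(standards, 0.3), 2))
```

Samples whose signal falls outside the range of the standards are rejected rather than extrapolated; this is where the serial dilutions of the sample mentioned earlier are useful.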

Example 23: Formulating a Polypeptide

The secreted polypeptide composition will be formulated and dosed in a fashion
consistent with good medical practice, taking into account the clinical condition of the
individual patient (especially the side effects of treatment with the secreted polypeptide
alone), the site of delivery, the method of administration, the scheduling of
administration, and other factors known to practitioners. The "effective amount" for
purposes herein is thus determined by such considerations.

As a general proposition, the total pharmaceutically effective amount of secreted
polypeptide administered parenterally per dose will be in the range of about 1 ng/kg/day
to 10 mg/kg/day of patient body weight, although, as noted above, this will be subject
to therapeutic discretion. More preferably, this dose is at least 0.01 mg/kg/day, and
most preferably for humans between about 0.01 and 1 mg/kg/day for the hormone. If
given continuously, the secreted polypeptide is typically administered at a dose rate of
about 1 µg/kg/hour to about 50 µg/kg/hour, either by 1-4 injections per day or by
continuous subcutaneous infusions, for example, using a mini-pump. An intravenous
bag solution may also be employed. The length of treatment needed to observe changes
and the interval following treatment for responses to occur appear to vary depending
on the desired effect.
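
The arithmetic behind the continuous dose rate can be illustrated with a short sketch. The 70 kg body weight and the 24-hour period are assumptions chosen only for illustration, and the function name is hypothetical; actual dosing remains subject to therapeutic discretion as stated above.

```python
def infusion_amount_ug(rate_ug_per_kg_hr, weight_kg, hours):
    """Total polypeptide delivered (µg) = dose rate x body weight x duration."""
    if min(rate_ug_per_kg_hr, weight_kg, hours) < 0:
        raise ValueError("inputs must be non-negative")
    return rate_ug_per_kg_hr * weight_kg * hours

# Assumed 70 kg patient infused for 24 hours (illustrative values only).
low_ug = infusion_amount_ug(1, 70, 24)    # lower end of the stated rate range
high_ug = infusion_amount_ug(50, 70, 24)  # upper end of the stated rate range
```

Under these assumptions the stated rate range corresponds to 1,680 µg to 84,000 µg (1.68 mg to 84 mg) over one day.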

Pharmaceutical compositions containing the secreted protein of the invention are
administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal
patch), buccally, or as an oral or nasal spray. "Pharmaceutically acceptable carrier" refers
to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or
formulation auxiliary of any type. The term "parenteral" as used herein refers to modes
of administration which include intravenous, intramuscular, intraperitoneal, intrasternal,
subcutaneous and intraarticular injection and infusion.

The secreted polypeptide is also suitably administered by sustained-release
systems. Suitable examples of sustained-release compositions include semi-permeable
polymer matrices in the form of shaped articles, e.g., films, or microcapsules.
Sustained-release matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,481),
copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. et al.,
Biopolymers 22:547-556 (1983)), poly(2-hydroxyethyl methacrylate) (R. Langer et
al., J. Biomed. Mater. Res. 15:167-277 (1981), and R. Langer, Chem. Tech. 12:98-105
(1982)), ethylene vinyl acetate (R. Langer et al.) or poly-D-(-)-3-hydroxybutyric
acid (EP 133,988). Sustained-release compositions also include liposomally entrapped
polypeptides. Liposomes containing the secreted polypeptide are prepared by methods
known per se: DE 3,218,121; Epstein et al., Proc. Natl. Acad. Sci. USA 82:3688-3692
(1985); Hwang et al., Proc. Natl. Acad. Sci. USA 77:4030-4034 (1980); EP 52,322;
EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese Pat. Appl. 83-118008;
U.S. Pat. Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the liposomes
are of the small (about 200-800 Angstroms) unilamellar type in which the lipid content
is greater than about 30 mol. percent cholesterol, the selected proportion being adjusted
for the optimal secreted polypeptide therapy.

For parenteral administration, in one embodiment, the secreted polypeptide is
formulated generally by mixing it at the desired degree of purity, in a unit dosage
injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable
carrier, i.e., one that is non-toxic to recipients at the dosages and concentrations
employed and is compatible with other ingredients of the formulation. For example, the
formulation preferably does not include oxidizing agents and other compounds that are
known to be deleterious to polypeptides.

Generally, the formulations are prepared by contacting the polypeptide
uniformly and intimately with liquid carriers or finely divided solid carriers or both.
Then, if necessary, the product is shaped into the desired formulation. Preferably the
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl
oleate are also useful herein, as well as liposomes.

The carrier suitably contains minor amounts of additives such as substances that
enhance isotonicity and chemical stability. Such materials are non-toxic to recipients at
the dosages and concentrations employed, and include buffers such as phosphate,
citrate, succinate, acetic acid, and other organic acids or their salts; antioxidants such as
ascorbic acid; low molecular weight (less than about ten residues) polypeptides, e.g.,
polyarginine or tripeptides; proteins, such as serum albumin, gelatin, or
immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids,
such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides,
disaccharides, and other carbohydrates including cellulose or its derivatives, glucose,
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or
sorbitol; counterions such as sodium; and/or nonionic surfactants such as polysorbates,
poloxamers, or PEG.

The secreted polypeptide is typically formulated in such vehicles at a
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of
about 3 to 8. It will be understood that the use of certain of the foregoing excipients,
carriers, or stabilizers will result in the formation of polypeptide salts.

Any polypeptide to be used for therapeutic administration can be sterile.
Sterility is readily accomplished by filtration through sterile filtration membranes (e.g.,
0.2 micron membranes). Therapeutic polypeptide compositions generally are placed
into a container having a sterile access port, for example, an intravenous solution bag or
vial having a stopper pierceable by a hypodermic injection needle.

Polypeptides ordinarily will be stored in unit or multi-dose containers, for
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml vials
are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, and the
resulting mixture is lyophilized. The infusion solution is prepared by reconstituting the
lyophilized polypeptide using bacteriostatic Water-for-Injection.
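
The quantity of polypeptide in such a vial follows directly from the fill volume and the w/v percentage. The sketch below works through that arithmetic; the function names and the 10 ml reconstitution volume are assumptions for illustration, since the text does not state a reconstitution volume.

```python
def solute_mass_mg(percent_w_v, volume_ml):
    """Mass of solute in mg; 1% (w/v) means 1 g per 100 ml, i.e. 10 mg/ml."""
    return percent_w_v * 10.0 * volume_ml

def reconstituted_conc_mg_per_ml(mass_mg, diluent_ml):
    """Concentration after dissolving the lyophilized cake in diluent."""
    if diluent_ml <= 0:
        raise ValueError("diluent volume must be positive")
    return mass_mg / diluent_ml

vial_mg = solute_mass_mg(1.0, 5.0)                    # 5 ml of 1% (w/v) solution
conc = reconstituted_conc_mg_per_ml(vial_mg, 10.0)    # assumed 10 ml of diluent
```

Each vial in the example therefore holds 50 mg of polypeptide before lyophilization; with the assumed 10 ml of diluent the reconstituted solution would be 5 mg/ml.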

The invention also provides a pharmaceutical pack or kit comprising one or
more containers filled with one or more of the ingredients of the pharmaceutical
compositions of the invention. Associated with such container(s) can be a notice in the
form prescribed by a governmental agency regulating the manufacture, use or sale of
pharmaceuticals or biological products, which notice reflects approval by the agency of
manufacture, use or sale for human administration. In addition, the polypeptides of the
present invention may be employed in conjunction with other therapeutic compounds.

Example 24: Method of Treating Decreased Levels of the Polypeptide

It will be appreciated that conditions caused by a decrease in the standard or
normal expression level of a secreted protein in an individual can be treated by
administering the polypeptide of the present invention, preferably in the secreted form.
Thus, the invention also provides a method of treatment of an individual in need of an
increased level of the polypeptide comprising administering to such an individual a
pharmaceutical composition comprising an amount of the polypeptide to increase the
activity level of the polypeptide in such an individual.

For example, a patient with decreased levels of a polypeptide receives a daily
dose of 0.1-100 µg/kg of the polypeptide for six consecutive days. Preferably, the
polypeptide is in the secreted form. The exact details of the dosing scheme, based on
administration and formulation, are provided in Example 23.

Example 25: Method of Treating Increased Levels of the Polypeptide 

Antisense technology is used to inhibit production of a polypeptide of the
present invention. This technology is one example of a method of decreasing levels of
a polypeptide, preferably a secreted form, due to a variety of etiologies, such as cancer.

For example, a patient diagnosed with abnormally increased levels of a
polypeptide is administered intravenously antisense polynucleotides at 0.5, 1.0, 1.5,
2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest period
if the treatment was well tolerated. The formulation of the antisense polynucleotide is
provided in Example 23.

Example 26: Method of Treatment Using Gene Therapy 

One method of gene therapy transplants fibroblasts, which are capable of
expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a
subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and
separated into small pieces. Small chunks of the tissue are placed on a wet surface of a
tissue culture flask; approximately ten pieces are placed in each flask. The flask is
turned upside down, closed tight and left at room temperature overnight. After 24
hours at room temperature, the flask is inverted and the chunks of tissue remain fixed to
the bottom of the flask and fresh media (e.g., Ham's F12 media, with 10% FBS,
penicillin and streptomycin) is added. The flasks are then incubated at 37°C for
approximately one week.




At this time, fresh media is added and subsequently changed every several days.
After an additional two weeks in culture, a monolayer of fibroblasts emerges. The
monolayer is trypsinized and scaled into larger flasks.

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is
fractionated on agarose gel and purified, using glass beads.

The cDNA encoding a polypeptide of the present invention can be amplified
using PCR primers which correspond to the 5' and 3' end sequences respectively as set
forth in Example 1. Preferably, the 5' primer contains an EcoRI site and the 3' primer
includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear
backbone and the amplified EcoRI and HindIII fragment are added together, in the
presence of T4 DNA ligase. The resulting mixture is maintained under conditions
appropriate for ligation of the two fragments. The ligation mixture is then used to
transform bacteria HB101, which are then plated onto agar containing kanamycin for
the purpose of confirming that the vector has the gene of interest properly inserted.
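
Before ordering primers for this cloning step, it can help to confirm that each primer carries the intended restriction site. The sketch below is only an illustration under stated assumptions: the two primer sequences are hypothetical placeholders, not the primers of Example 1, and the helper names are invented here. The recognition sequences used are the standard ones for EcoRI (GAATTC) and HindIII (AAGCTT).

```python
ECORI = "GAATTC"    # EcoRI recognition sequence
HINDIII = "AAGCTT"  # HindIII recognition sequence

def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def primers_suit_directional_cloning(five_prime, three_prime):
    """True if the 5' primer has an EcoRI site and the 3' primer a HindIII site."""
    return bool(find_sites(five_prime, ECORI)) and bool(find_sites(three_prime, HINDIII))

# Hypothetical primers, for illustration only.
fwd = "GCGGAATTCATGGCTAGCAAAGGA"
rev = "GCGAAGCTTTCATTTGTAGAGCTC"
ok = primers_suit_directional_cloning(fwd, rev)
```

The same `find_sites` helper could be run over the cDNA insert itself, since an internal EcoRI or HindIII site would be cut during the digest and would argue for a different enzyme pair.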

The amphotropic pA317 or GP+am12 packaging cells are grown in tissue
culture to confluent density in Dulbecco's Modified Eagles Medium (DMEM) with 10%
calf serum (CS), penicillin and streptomycin. The MSV vector containing the gene is
then added to the media and the packaging cells are transduced with the vector. The
packaging cells now produce infectious viral particles containing the gene (the
packaging cells are now referred to as producer cells).

Fresh media is added to the transduced producer cells, and subsequently, the
media is harvested from a 10 cm plate of confluent producer cells. The spent media,
containing the infectious viral particles, is filtered through a millipore filter to remove
detached producer cells and this media is then used to infect fibroblast cells. Media is
removed from a sub-confluent plate of fibroblasts and quickly replaced with the media
from the producer cells. This media is removed and replaced with fresh media. If the
titer of virus is high, then virtually all fibroblasts will be infected and no selection is
required. If the titer is very low, then it is necessary to use a retroviral vector that has a
selectable marker, such as neo or his. Once the fibroblasts have been efficiently
infected, the fibroblasts are analyzed to determine whether protein is produced.

The engineered fibroblasts are then transplanted onto the host, either alone or
after having been grown to confluence on cytodex 3 microcarrier beads.

Example 27: Method of Treatment Using Gene Therapy - In Vivo

Another aspect of the present invention is using in vivo gene therapy 
methods to treat disorders, diseases and conditions. The gene therapy method 
relates to the introduction of naked nucleic acid (DNA, RNA, and antisense 
DNA or RNA) sequences into an animal to increase or decrease the expression 
of the polypeptide of the present invention. A polynucleotide of the present 
invention may be operatively linked to a promoter or any other genetic elements 
necessary for the expression of the encoded polypeptide by the target tissue. 
Such gene therapy and delivery techniques and methods are known in the art, 
see, for example, WO90/11092, WO98/11779; U.S. Patent Nos. 5693622,
5705151, 5580859; Tabata H. et al. (1997) Cardiovasc. Res. 35(3):470-479,
Chao J et al. (1997) Pharmacol. Res. 35(6):517-522, Wolff J.A. (1997)
Neuromuscul. Disord. 7(5):314-318, Schwartz B. et al. (1996) Gene Ther.
3(5):405-411, Tsurumi Y. et al. (1996) Circulation 94(12):3281-3290
(incorporated herein by reference). 

The polynucleotide constructs of the present invention may be delivered 
by any method that delivers injectable materials to the cells of an animal, such 
as, injection into the interstitial space of tissues (heart, muscle, skin, lung, liver,
intestine and the like). These polynucleotide constructs can be delivered in a 
pharmaceutically acceptable liquid or aqueous carrier. 

The term "naked" polynucleotide, DNA or RNA, refers to sequences 
that are free from any delivery vehicle that acts to assist, promote, or facilitate 
entry into the cell, including viral sequences, viral particles, liposome 
formulations, lipofectin or precipitating agents and the like. However, the 
polynucleotides may also be delivered in liposome formulations (such as those 
taught in Felgner P.L. et al. (1995) Ann. NY Acad. Sci. 772:126-139 and
Abdallah B. et al. (1995) Biol. Cell 85(1):1-7) which can be prepared by
methods well known to those skilled in the art. 

The polynucleotide vector constructs of the present invention used in 
the gene therapy method are preferably constructs that will not integrate into the 
host genome nor will they contain sequences that allow for replication. Any 
strong promoter known to those skilled in the art can be used for driving the 
expression of DNA. Unlike other gene therapy techniques, one major
advantage of introducing naked nucleic acid sequences into target cells is the
transitory nature of the polynucleotide synthesis in the cells. Studies have
shown that non-replicating DNA sequences can be introduced into cells to
provide production of the desired polypeptide for periods of up to six months.

The polynucleotide construct of the present invention can be delivered to
the interstitial space of tissues within an animal, including that of muscle, skin,
brain, lung, liver, spleen, bone marrow, thymus, heart, lymph, blood, bone,
cartilage, pancreas, kidney, gall bladder, stomach, intestine, testis, ovary,
uterus, rectum, nervous system, eye, gland, and connective tissue. Interstitial
space of the tissues comprises the intercellular fluid, mucopolysaccharide matrix
among the reticular fibers of organ tissues, elastic fibers in the walls of vessels or
chambers, collagen fibers of fibrous tissues, or that same matrix within
connective tissue ensheathing muscle cells or in the lacunae of bone. It is
similarly the space occupied by the plasma of the circulation and the lymph fluid
of the lymphatic channels. Delivery to the interstitial space of muscle tissue is
preferred for the reasons discussed below. The constructs may be conveniently delivered
by injection into the tissues comprising these cells. They are preferably delivered
to and expressed in persistent, non-dividing cells which are differentiated,
although delivery and expression may be achieved in non-differentiated or less
completely differentiated cells, such as, for example, stem cells of blood or skin
fibroblasts. In vivo muscle cells are particularly competent in their ability to take
up and express polynucleotides.

For the naked polynucleotide injection, an effective dosage amount of
DNA or RNA will be in the range of from about 0.05 µg/kg body weight to about
50 mg/kg body weight. Preferably the dosage will be from about 0.005 mg/kg
to about 20 mg/kg and more preferably from about 0.05 mg/kg to about 5 mg/kg.
Of course, as the artisan of ordinary skill will appreciate, this dosage will vary
according to the tissue site of injection. The appropriate and effective dosage of
nucleic acid sequence can readily be determined by those of ordinary skill in the
art and may depend on the condition being treated and the route of
administration. The preferred route of administration is by the parenteral route of
injection into the interstitial space of tissues. However, other parenteral routes
may also be used, such as, inhalation of an aerosol formulation particularly for
delivery to lungs or bronchial tissues, throat or mucous membranes of the nose.
In addition, naked polynucleotide constructs can be delivered to arteries during
angioplasty by the catheter used in the procedure.
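
The preferred per-kilogram ranges above translate into absolute amounts once a body weight is fixed. The following sketch shows that scaling for the two preferred ranges only; the function names and the 0.02 kg (20 g) mouse weight are assumptions made for illustration and are not taken from the text.

```python
# Preferred ranges quoted in the text, in mg of nucleic acid per kg body weight.
PREFERRED_MG_PER_KG = (0.005, 20.0)
MORE_PREFERRED_MG_PER_KG = (0.05, 5.0)

def dose_range_mg(range_mg_per_kg, weight_kg):
    """Scale a (low, high) mg/kg range to absolute mg for one body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * weight_kg, high * weight_kg)

def classify_dose(dose_mg_per_kg):
    """Report which of the quoted preferred ranges, if any, a dose falls into."""
    low, high = MORE_PREFERRED_MG_PER_KG
    if low <= dose_mg_per_kg <= high:
        return "more preferred"
    low, high = PREFERRED_MG_PER_KG
    if low <= dose_mg_per_kg <= high:
        return "preferred"
    return "outside preferred ranges"

# Assumed 0.02 kg (20 g) mouse, chosen only for illustration.
mouse_low_mg, mouse_high_mg = dose_range_mg(MORE_PREFERRED_MG_PER_KG, 0.02)
```

For the assumed 20 g mouse, the more preferred range works out to roughly 0.001 mg to 0.1 mg (1 µg to 100 µg) of nucleic acid per injection.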

The dose response effects of injected polynucleotide in muscle in vivo are
determined as follows. Suitable template DNA for production of mRNA coding
for the polypeptide of the present invention is prepared in accordance with a
standard recombinant DNA methodology. The template DNA, which may be
either circular or linear, is either used as naked DNA or complexed with
liposomes. The quadriceps muscles of mice are then injected with various
amounts of the template DNA.

Five to six week old female and male Balb/C mice are anesthetized by
intraperitoneal injection with 0.3 ml of 2.5% Avertin. A 1.5 cm incision is made
on the anterior thigh, and the quadriceps muscle is directly visualized. The
template DNA is injected in 0.1 ml of carrier in a 1 cc syringe through a 27 gauge
needle over one minute, approximately 0.5 cm from the distal insertion site of the
muscle into the knee and about 0.2 cm deep. A suture is placed over the
injection site for future localization, and the skin is closed with stainless steel
clips.

After an appropriate incubation time (e.g., 7 days) muscle extracts are prepared
by excising the entire quadriceps. Every fifth 15 µm cross-section of the individual
quadriceps muscles is histochemically stained for protein expression. A time course for
protein expression may be done in a similar fashion except that quadriceps from
different mice are harvested at different times. Persistence of DNA in muscle following
injection may be determined by Southern blot analysis after preparing total cellular DNA
and HIRT supernatants from injected and control mice. The results of the above
experimentation in mice can be used to extrapolate proper dosages and other treatment
parameters in humans and other animals using naked DNA of the present invention.

It will be clear that the invention may be practiced otherwise than as particularly
described in the foregoing description and examples. Numerous modifications and
variations of the present invention are possible in light of the above teachings and,
therefore, are within the scope of the appended claims.

The entire disclosure of each document cited (including patents, patent
applications, journal articles, abstracts, laboratory manuals, books, or other
disclosures) in the Background of the Invention, Detailed Description, and Examples is
hereby incorporated herein by reference.

Sequence Listing

(1) GENERAL INFORMATION:

(i) APPLICANT: Human Genome Sciences, Inc., et al.

(ii) TITLE OF INVENTION: 207 Human Secreted Proteins

(iii) NUMBER OF SEQUENCES: 800

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Human Genome Sciences, Inc.

(B) STREET: 9410 Key West Avenue

(C) CITY: Rockville

(D) STATE: Maryland

(E) COUNTRY: USA

(F) ZIP: 20850

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage

(B) COMPUTER: HP Vectra 486/33

(C) OPERATING SYSTEM: MSDOS version 6.2

(D) SOFTWARE: ASCII Text

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kenley K. Hoover

(B) REGISTRATION NUMBER: 40,302

(C) REFERENCE/DOCKET NUMBER: PZ007PCT

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (301) 309-8504

(B) TELEFAX: (301) 309-8439






(2) INFORMATION FOR SEQ ID NO: 1: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 733 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GGGATCCGGA GCCCAAATCT TCTGACAAAA CTCACACATG CCCACCGTGC CCAGCACCTG 60
AATTCGAGGG TGCACCGTCA GTCTTCCTCT TCCCCCCAAA ACCCAAGGAC ACCCTCATGA 120
TCTCCCGGAC TCCTGAGGTC ACATGCGTGG TGGTGGACGT AAGCCACGAA GACCCTGAGG 180
TCAAGTTCAA CTGGTACGTG GACGGCGTGG AGGTGCATAA TGCCAAGACA AAGCCGCGGG 240
AGGAGCAGTA CAACAGCACG TACCGTGTGG TCAGCGTCCT CACCGTCCTG CACCAGGACT 300
GGCTGAATGG CAAGGAGTAC AAGTGCAAGG TCTCCAACAA AGCCCTCCCA ACCCCCATCG 360
AGAAAACCAT CTCCAAAGCC AAAGGGCAGC CCCGAGAACC ACAGGTGTAC ACCCTGCCCC 420
CATCCCGGGA TGAGCTGACC AAGAACCAGG TCAGCCTGAC CTGCCTGGTC AAAGGCTTCT 480
ATCCAAGCGA CATCGCCGTG GAGTGGGAGA GCAATGGGCA GCCGGAGAAC AACTACAAGA 540
CCACGCCTCC CGTGCTGGAC TCCGACGGCT CCTTCTTCCT CTACAGCAAG CTCACCGTGG 600
ACAAGAGCAG GTGGCAGCAG GGGAACGTCT TCTCATGCTC CGTGATGCAT GAGGCTCTGC 660
ACAACCACTA CACGCAGAAG AGCCTCTCCC TGTCTCCGGG TAAATGAGTG CGACGGCCGC 720
GACTCTAGAG GAT 733






(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:



Trp Ser Xaa Trp Ser 
1               5



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 86 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC
CCCGAAATAT CTGCCATCTC AATTAG




(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GCGGCAAGCT TTTTGCAAAG CCTAGGC




(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 271 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CTCGAGATTT CCCCGAAATC TAGATTTCCC CGAAATGATT TCCCCGAAAT GATTTCCCCG
AAATATCTGC CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC CGCCCATCCC
GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT
TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT
TTTTGGAGGC CTAGGCTTTT GCAAAAAGCT T



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GCGCTCGAGG GATGACAGCG ATAGAACCCC GG 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



GCGAAGCTTC GCGACTCCCC GGATCCGCCT C 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGGGACTTTC CC 
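
Because each entry declares its length, a transcribed sequence can be checked mechanically against its header. The sketch below does this for SEQ ID NOs: 6, 7 and 8 as printed above; the function name and report layout are illustrative assumptions. It counts residues after removing the 10-residue grouping and flags any character that is not an upper-case IUPAC nucleotide code, which is a convenient way to spot transcription damage.

```python
# Upper-case IUPAC nucleotide codes (bases plus ambiguity symbols).
IUPAC_NUCLEOTIDES = set("ACGTURYKMSWBDHVN")

def check_entry(grouped_sequence, declared_length):
    """Compare a grouped sequence with its declared length and alphabet."""
    seq = "".join(grouped_sequence.split())  # drop spaces and line breaks
    unexpected = sorted(set(seq) - IUPAC_NUCLEOTIDES)
    return {
        "length": len(seq),
        "length_matches": len(seq) == declared_length,
        "unexpected_symbols": unexpected,
    }

# Sequences and declared lengths as they appear in the listing.
entries = {
    6: ("GCGCTCGAGG GATGACAGCG ATAGAACCCC GG", 32),
    7: ("GCGAAGCTTC GCGACTCCCC GGATCCGCCT C", 31),
    8: ("GGGGACTTTC CC", 12),
}
report = {seq_id: check_entry(seq, n) for seq_id, (seq, n) in entries.items()}
```

All three entries pass both checks. A group such as `GGCiTITTTG` would be reported with unexpected symbols `I` and `i`, pointing to characters that need to be verified against the original filing.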



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 73 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GCGGCCTCGA GGGGACTTTC CCGGGGACTT TCCGGGGACT TTCCGGGACT TTCCATCCTG 60
CCATCTCAAT TAG 73

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 256 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CTCGAGGGGA CTTTCCCGGG GACTTTCCGG GGACTTTCCG GGACTTTCCA TCTGCCATCT 60
CAATTAGTCA GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC 120
CAGTTCCGCC CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA 180
GGCCGCCTCG GCCTCTCAGC TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG 240
CTTTTGCAAA AAGCTT 256



(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2526 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GACAGGCTAT CCGAGAATCT GAGAGCTGGG CCCGGCAATT CCTCCAGYTA CCCTTGTGAC 60
CTAAGTCCAG TCACACATTT CCCAAAGTTT CTCTTTGTCA TAACCCTGGT CTGGCTGGTT 120
TTGRGGRCTT GAGAATGGGT CAGGGACTCC AGGCCAAGTC CAACAGAGAC CCCAAACCCA 180
CCACACACCA GCAGCCACAA CCTCACCACC AACAAAGAGG ACTTTTGTGG GGCCACAAGT 240
AAGAGGTCAT TTCTCGAATG GACTCAGACC TTTAAACAGG AGAGTTGAGC ACTTCCAGKS 300
AGTnTTAAG CAAGGCATGG GGAACAGGGA ATAGAACCTT TCAAAGAGGT TGCCCAGAGA 360
AAAGCTGGGC CTCTTGCATT CGGCTTCCTT GGAGCAGCCT CTTCTGGCAG AAAGCCATCA 420
GGTGCTCAAT CATCTTCTCC TGGCCAAGGC TCTGACCATG CTTAGTACTG GAATAGAGGT 480
GGCCAGGCCC CCAGCGACTC TTCTTGGCCT GATGTTTGTC CTCACAGGCA TGCCACGTGG 540
CCTGAGATGA TTCAGAACAA ATCATGCTAA CTTTGAATCC ATCCAGCCAC TTGCAAATGA 600
TAATCAGAAG TCAGCTTGTT CACTGTTAGA AAGAAACTAA CAAAAGAGAA CCCAGAGCAA 660
TCTAGAATCT TTGAGTGCTT GGCTTTCCAA GGATACTGCG GAGACTCTGG CCAAGCTGAT 720
GAMCTTCTGA ARTGTCACTG GCACCATATG CAACAAGAAC CACCATTCAC TGAGTAGCTA 780
ATGGGTTTGG GGCCTGGGAC ATTCCATCTG AGGTCCTTCC TGAACATGTC ACTCCACAGC 840
AGAGGACCGG TTGCAGCTTA CCCAGAACCA CTCCTCCAGG AGAGCTGGAT GTTTTGCGTG 900
CAACACCTTG AGCACTGACT GCTATTGTTC AAAAAAAGCC TTTGCTGCAT TCGGAGGACT 960
GCCCCGTGCC CTGAGGTGAC TTCCTAACTA TGTGGTTTCA TTAGCGAATT TATTTTTTGT 1020
GCTGGGTGGA CATTTGTATT TTGTTAGGTT GCTGTTTAAG CTCAAGTTTG CTGTGCTCTC 1080
TGCAGCTACA AAACATCTTG GCATATTTAA GAKTGGCTTT TATAAATAGC TTTATTCTGA 1140
TATTAATCAG ATTCCCAACT TTACTGAGAA TTAAGGACTG GGGTACTTTA AAGAAATGCA 1200
AATAGCAATT GAAGAACCAC TGCTGCAGGT GGTAGCCCTG GCTAGACTGA ATTACACTAG 1260
AAATCAGCCA GAAGGAAGCG TCCTTGGGAT CCCAGATCAC TCTTTTTTTT TTTTTTTTTA 1320
AAAGGGGCAG CCCCTTGATG GCTCATCTCT CTGAATAACA GTTACGTCTT CATATCGATA 1380
CCAGATGCCT TCTTCATCAT GCCACTGAAG CCACTCACCA CCTTCAAGAA CATGCCAACC 1440
TCTGTCAGAT TCACTTACCC ACAAACAAGG AGGCACGTTT GGCACAAAGT GTTGTCCTCC 1500
AGGTCCAAGT GGACTCTACA GAGTCCTTGA CXTTCAACACA CTGGATTCCA GGTGGACTGG 1560
ACCAAGAGCA GGCAAAGACA CGGGAACTGA AAAACTCCAC AGGGTTTGGA GAATAGAAAT 1620
GAAAAGCCAC GTCATATAAC TCAAGAATAA ATGGTGrTTT GGAAATTTTA AAATTATCAT 1680
CGAAGGTGGT GAAACTATTT CAGGCCCAAA TGAAAGGAAA TCGCCAGTTG GGGATGAAAT 1740
CACAGAGCCT GTGTTTTATG ATATQGTTGG ATGTCCACTG ATGAAATTTT AAAGGAGTTT 1800
CATTTTTAAA AGTGCGCATG ATTCTACATA TGAGAATTCT TTAGGCCAAG AAACTGTCCT 1860
TGGCTCAGAG GTGTTGGGAA TTAAAGCAGA GAGAAGCCAT TCGTGATGCT TAGAACCAAG 1920
GATGGTCATG TACACAAAGA CCATCGAGAC GGCCATTCTT GTTTACAAAA CACTTACCAA 1980
GAAAGCACTT TGTAGGGGAA CTTTAGTAAG TTCTTCTCAT TTCATTATGT TTCTTCCAAG 2040
GAAACAGGAG AGACTGAATT AATAATTCTC TCTTTCCTCT TAAGCACTTT TAAAATAATA 2100
AAGTACATCT TGAAATTTGG GGGGGCATCT CTGATTTAAA AAAAGAAAAA GGCTGCTTGA 2160
TGTATGTTAT GCAGAGACAC TCTGCCTCTG GTGGCTGCAG AGCAATACCC AAGCCTCATT 2220
TGGAAGGCTC AACATriGGA ATTGCACTTT AATTGATTAA TCCTCAATTC ATGTGGCCTT 2280
ACGGGATGGT GGGTCTGGGA CCCCAATTCA TTCTTATCTG CCAAAGAATT ATCTAGAAGC
ACATCAAATA CCAGCACCCC ACCTGCACAA TGGGGGTGGA AAACTTTTGT ATCCCTAAGC
ATATTATTTT ATAGTGTCTG CCATGCCATG TGGAAATACT TTATTTTTAA CCTCAGGATT
TAAATAAAGT AAACACTATG ACATTTAAAA AAAAAAAAAA AAAACTCGAG GGGGGCCCGG
TACCCA

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CACTGCACCA GCTTTGTTAT CTGTAAAATG ATGATAATAC CAACACCTTC TTCTTGGGGT 
ACTGAAGATG AGAGAACATG ATATGTGTAA AGTGCCTTCC ACAATACCCA GAACATAGCA 
AACATGTAAT GAATGTAGTA ATAGTAATTA TTTTATTTTC TTTTGATTCA GTTGGGACTA 
TGTTCAGCTG TAACAGAATA CCCAAAATAA CTGTTTTAAA CAAATTAAAG TTTWCTTCTG 
AAGTTTTGTT ACGAATTCAG ACAATCCAGG GCTTTTATAG ATGCACCAGG ATCAGCAGGT 
ACAAAGGCAT CTTTCCTGAT TTCTGCCAGT CTCAATGCAT GGGTTGCAAT CCAGARTCCA 
RGATGGCAGT TCCAGCCCTG GTTACGCCCA TATTAGCACA CAGAAAGAAA GAGAAAGGGA 
TGTGCCTCTT CACTTTAATC ATAGCTCCCA CTAGATGCAC CCACTACTTC TGCTGATACT 
CCATTAGCTA ATGCTTGCTT ACATGGTCAC ACTTAGTTTC CAGAGAGACA TGTCTGGACA
GTCATGTGCT CAATTAATAT CCAAGTGTCC AATTACTGAG AAAAAAAGAA ACTAGCACCT 
TTGCTTGGTT GCATTCCTCT TAGCATAAGC CACATTCTTT TTATGAAGTT GTCCTCAGTT 
ACTTGGATGC CTCAGTTGTC CTTTCAWTTA GAAAWGCYCC TKGGACAYCC TGAAWCTGAC 
TTCTTTTGTC ATCAGCACCA TCACTACCAC TGCCYTCTTC AAAGCCACCA CGTTCTGTCC 
CCAGGATGGT TGCAACAACC ACCATAGGGA CTTTTTGCCT TCTACTTCCA CACAATAGNC 
CAGAGTAAGC TTTTGAAAAT GTAGGTCAGA TCATGTCTCT CTCTTCCTCT TCAAAACCCT 
CCCGATGGCT TTTCATATTA CTCAAAAGAA AACCTAAAAC TTTGCTGTGA GATCTATGTG 
ACCCGGCTTA TTCTTCCTCT TACTTTATCT CTGTATTGCT CTTCCTCACT CTACTCCAGC 
CATCCCACCT CCTTGCTGCT TGTCCTATAC TCCTAAAAGA AGTTCAGTCT TCCCTTATGA
TATTTOCACT TAAAATAGAA AAAAAAAAAA AAAAAAAACT CGAGGGGGGC C 1131



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 941 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GGCACGAGTA GCATTTCATT TAATCTGCAG GTATATTCTC CCAACAGTTT ATTGTCATGT 60
GATGTCCTCA GCCAAGATTG TRAGGCAGAG AGGAGCTGTC CCAACCTACT ATACCACCGA 120
GGCTGGAGAG ATCATATTTT TGGTATTAAA CTGGAGTCTC TCCATCCTTC ACATTGTrGA 180
TGTCCTCTGT AGCAAACCGG AAAAGTCAGT GACAGAAGAT GCCGCTAGCG GTTTGAGCCA 240
GAGAATGACA GCTCTGGTTT GGAGAAAAGG GCCGGATGGT GGCTCTAGAA AGCCCATCCT 300
TCTGCTCTTC TnTTTCTCC CCCTTATATT GTGCTTTCAT TCATTCATTC ATTCATCAAA 360
CATTTCTTCA GCACCTATTA TCTGTCAAGC TCTGTGCTAG CCTCTGGAAA ACCTGCCCTC 420
ATCTAGCTCA CTGTCGAGTA GGAGAAACAA TCACTACACT ATGATAAGCA CGGGTTGTCA 480
GGGTCTCACA GAGCAGTGGC CCCTCATCCA GACCGATGAG GTCAAAGAAG GCATCCAGGC 540
GAGGATGGTG TCAGAGCTAA CTCAAGAATG AGAGGGAGCT GCACCASCAG GGGTTGGAAC 600
TCAAGGTCGC AGrTGCCTCGA GTCirGATTC CAGCAGAGGG AGAGCAGTCT GTGAAAAGGC 660
ACCAAGGGTG GGAGAGGGCA GAGCACATGG AGGAACTTCA GGTAGTTCTG GATGGCSCTG 720
GGGCAAAGCT AGAGAGCTAA GAAGAATCTA CAAATGTTCC TCGACTTACA TGAACTTCCA 780
TCCCAATAAA CCCATTGGAA ACGAAAAATT TAAGTCAGAA GrGCATTTAA GGCTGGTCCG 840
AGTAGAATCA TTTTTACAAC GAATTGATCA CAACCAGTTA CAGATGTCTO TCTTCCTTCr 900
CCACTCCCAC TGCTTCACCT GACTAGCCTT TAAAAAAAAA A 941

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 843 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CNAGGGATAA CCCCAAAG^7^ GGGAAATAAA CCCTCAATTA AAGGGGGAAC CAA^^AGCTG
GGAAGTTCCC CCCCGCGGTG GCGGCCNGNT CTAGGAACTA GTGGAATCCC CCGGGGCTGC
AGGGAATTCG GCACGGAGTG GGAATGTTGT TTGTATGATA CTATTTCCAC AAWATGCATT
GAGACTTGGT KTGTGGCCTA GGACATGGTC AATTCTTTYT AAATATTCCG TGAATTTCTT
TAGTGCATAT TCTCCGATGG GGGCTGTGGG GACAGAGTTC TAAATATGCC CATTAGATTA
AATCTCTTCA TTCTGTTGCT CACATCTTCT ATATCCTTAT TAATCTGTCA ATCTCTTCAA
GAGAGGTGTT ATTAAAATCT CTCACTGTAT GTGTCACTTT GCCCTTAAAA TTCTCATGAT
TTGCTTTATA AATGGTTATA ACCATTTTCC AGGAAGAACA TTAAAGAACT TTCCATTGGC
ATTATCCAGT TTCCCTCAAA ATACTGGTTT TTTTTATTTT GGCTNCTAAG CAGCTATGAA
TCCAGTTTCT CAGAAGCCCT TGTCTCAAGG CATTTGTTTC CAGATTACCT TGTTAGCATC
CACACTATGG GCTATTTTAG AAAAACAAAA AAAGTATCAA AATCATATAG CTATGATTTT
CCTGTGCTTG AAGGAGCCTT AAAGCTCATC TAGTCCAGCC AGTATTTGTT CATCCAAATT
CTGCCAAGAA ATCTCTATTG TCAAGATATT CTTTACCATC TTTGGGACAT TCTCATTATT
AGAAACAAAT CCTAAGAAGA AATTCTGCCA TAKACAACCC ATCCGTTCTT TAAAAAAAAA
AAA

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1018 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CTGTAATTTT TAATTTTCAT ATACCGTGCT TTGATTCTAA TTTTATTTTT TGAGTTCTCT 
GAAGGTTACA TATACAGAGT GCTTCAGGAA TGATCATTTT GTTATTATTC ATGCTTCTTA 
ACAATGTTGT TTTAGTCCAA GAAGATAATT GCCAGAGAAA GAATACAGTG CAGGAAAGAA
GARGCTGGAG CCAGTGGTGA AGARGGATTG AGARGACAGA CATTGTGGGA ATGAAATCAT 
GAATAATCGT GTTTTTGAAT TGTCCAAAAA CTTCTACAAA CCATGAAATG TTGGAGTTTA 
AATCTAATTG TTGAAAAATT CCCCACATTC CTTGTATCCC TTAGGTTGAG CATAATTCCA 
CATCCGTGGA CTGATGCACT TCCCAAGAGG GGGCCTCATT AACTCTTCCG AGGCAGCAGC 
AGCAAGGGCA CCCCCTCCTT TCCCCCCACA CCCCAYTTCT CATGGCTCTT dTTCTCTCA 
TCTCATGCTT AGGTTAGAAA AGGGCACAAG GTAAGGAAGC CCTTGGGAAT AGGCTGAATC 
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TCGCTATCTA ATTTOGTCCC AAATACTTAA TOTCJCTTGAA TTTAAAAACA GCAAACATGT 
AGAAAGCTAA TTATAArTAT GAGGCCAGTT CTTTAAGCTA GCrrmTTC CCCTCTCAAA 
CAC3CATATTC GCTTOGATCT CAGCAGGAGA AAGTGTTTTT TGCAATACAC ATAATGCATA
TATCOTCCTG TTAGCAATCT ATAGAAAATA GATATTGCTC ATTAAGGTAA ATAriTrTGT 
TCATGAATGA TCTOGAAIXIG TCTCGACTTG TTGTGTGAAC AGGAAATTGC TCTGTAGGCT
TTCACTTCTG AGCTTAAAGAG TCAGGCTCGT AAGATTAATT AAAGTAAATA CTGTGACAAT 
AGGATCTCAA AACCAAAAAC GTCTITCTGA AACTCAAGGA ATTAATGACA CATAGGGAAG 
mi.3CCAT ATTAAGCATA GAGTAGGAGA GGCAAGTCAA GAATAAAAAA AAAAAAAA 1018 






(2) INFORMATION FOR SEQ ID NO: 16:



TOGCAGGAGC GCAGGA^ AGCK^VTCCC CCGGGTTGCA CCCCCCCAGY TCTGCTGGAC 
ATAAGYTCGT TAACAGAGAG CCTGGGAGCT GGGCAGCCTG TACCK3TO^ GTGCCGGCAC
CGCCTGGAGG TGGCTGGGCC AAGQAAGGGG CCTCTGAGCC CAGCATGGAT GCCIXXTCTAT 



CATGGAOTGT GT^AGGAGGT GGAGAGAGTT CGGCGCTCAG AGAGGTACCA GACCATCAAG 
OIGCGCAGGG CAGGGCTCGG ACCTACCCCA GGAATGTCCT GCCCTGGGAA TGACAACACA 
GTCCACACCA TOCACGGGGA GGCAAACAGG GGCAGCIGAC CCAGCCCAGG GGTCAGANGA
GCTCTIGCCG AGGAAC^ AGCTAAGCTC ATACCTGATA IGCACWAGKC AGCCARGYGG 
AGACAGGCAA GGAAGAAGCT TOITTM ACAGAAnTT CTAGATCACT CAGCACCATC 
TGGCrrrrGG GGCrrmGT TTTAm^ rmGAGACG GGGTCTCGCT CTGICGCCCA 660 

N 661



(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS:



600 
660 
720 
780 
840 
900 
960 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 661 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
TTTAAGAAAT TAGTGAAl^C CCGGNTGCAG GGAATTCGGC ACGAGGAGGA GGCCGTCAGC 60 

120 

180 
240 

GCCTGCCAGC GCCCTACGCC CCTCACACAC CACAACACTG GCCMTCCGA GCTGCIGGAG 300 


narrATGAAG 360 



420 
480 
540 
600 
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(A) LENGTH: 553 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GGCACAGGGC TATTTGCCCC TCTCTCCACA TGACAGAACT GCTCTAAGTT TCTTTGCTGC 
TCTTCTCAGC TGTCAGACGG CTTGCTGCTT GTTTTCCACA CCACCATGTC TATTCTTTGC 
TGTCCTTWAC TCTGCCTGTT TTTTTCCTTT TGTATTTCTT CTGGCTCTTG TCCCTTTTCC 
CACGTGTCWC AGCTTTCCTT TATTGCCACT TTCAGTCAGA GCAGTCCTGT GCTTCTGGTG 
CCGGCATACA ATACTTACTT GAGTTTCTTG GCTTTTCTTG ACTGTGCATC TCTTACTTCA 
ACATAGGAAT AGCCTGTCAT AGAATTTCTC CAGTTCCAGG GCTCAAGAGG GAGAGTGCCA 
GAAAATTGAG ACTGTTTTCC CTGTCTTGGA TTGAATTCAT AAAGCAAAAC CAGTGTTTGT 
GTCAGGGTTT GCTGTGTCAT GCCTATAGGT TGTTTGGGTG CAAACCTATA GAATCCAGCC 
TGCGAAAAGA AAGRAACCAG AGAATANCAG CATCAGAACA ATGCTTGACA TCATTTCTCA 
ATCAAGCAGT CCA 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 869 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GGCACGAGCT GCCAACACTG AGGTCTTCGT GGCTTCTCAC ATCTAGATGT ATCCCTCTCA 
AATCTATCCT CTATCCAGGC ACCAGATTGA GGTATCTAAA ATGTCAACTT TCCAGTTACT 
CCTTCTTATA CTAGCCCAAT CAACTTACAA GATAAAGTCC AAGCCCCTTC ATATGACAAA 
CCACACCCTG CTTAACTCTC CAGGTTTGAA TCCTTCATCT CCTACTTTAA ACTTTAAAAC 
CCAGCAGCAC GAAAGTGTCT CCTATGCATG TTGCCATATG CGTTCTCTCC ATCATGCATT 
TGCCTGAGCA AGATGTCTTG AGTTAACATC TTATTCTTTA AGACTCATTG TGGTGGTAGA 
CAGCCTTTAA TAACGGATCC TTGGCCAGGC ACAGTGACTC ACACCTGTAA TCCCAGAACT 
TTGAAAGGCC AAAGAAGGAA GAAAGCTTGA GGCCAGTAGT TTGAGACCAG CCTGGGAAAC 
AGAGAGATAT CCCATCTGTA CCAAAAATTT AAAAAAATAT TAGCAGGGAG TAGTGGCATG 
CACAAGTGGT CCCAGCTCCA TGGGAGASTG AGGTAGGAAC ATCACTTGAG CCCAGGAAGT 



WO 98/54963



276 



PCT/US98/11422 



CAAGGCTGCA C3TGAACCATG ATCAGAACAT TGCAOTCCAG CTTGCSGTAAC AGAGTGAGAC 660
CTTAGGTCAG AAAAATGAAT AAATAAGCAT AAAATTTTAA AAACTTAGCC AGGCATQGTG 720
GCACACATCT GTCCTCCCTG CTACTTAGGA GGCTGAGGTG AGAGGATCCT TGAGCCCAGG 780
AGGTCAACAC TACAGTCAGC TATGATTGTO CCACTAAACT CCAACCTCGG TGAAAAAGCA 840
AAACCCTGCC AAAAAAAAAA AAAAAAACT 869



(2) INFORMATION FOR SEQ ID NO: 19: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GGCGAGCCGA GATCGTCCCA TTCCACICCA GCCIGGGCAA CAAGAGTGAA ACTCTGTCTC 60
AAAAAAAAAA AATTATAATA CTATAIGCCA TAAAATGACA rrTCATATTT AAAGACmT 120
TTAAAACTCT TCTATTCACA TGCCATAATT TGAAACCCTA TTTCACTGAA TGAGAATGGT 180
A1CTOTIGTC CTXZATTrm CArTTTTATC CTTAACAATT TCCACCACAG CCAGTGCATA 240
TAATGGCAAT GACACCCAGG GATGGAATGA TAAGTTCCAT CRCMGCTCAG 7CAAGACGCA 300
GACriGATGT GGCCCCAACA ACAGTCAATA ATGGAGTCTC CAAAATAAAG CTCTATAGGA 360
AAGGTAAATA CCCGCTCCAC AAGAAACCAC AGCATCTAGG TTCTAACCCC ATCTCTATCA 420
AGAGCrrcCT GGGAGAGTTT TGA^AITOAA CAATCTGTCT GATKGCCAAT rrTYTrCITC 480
TATAAAAT<5A TAAl^GA YTCAAAGATC CAAAGTCAAT TCATGGTCTA AAACTTAATG 540
AirmrrAG cmrcKGAC AmcAcrGT acacigtagt aatttatatc 'rrArrrrccc 600
ACTAATTTAG AAAAATATYT AAATOATCCT TAATIGGCAA TGGGTCCTAA GAATTTlGrr 660
TTAAATCCCr GrTACCCAAA AGAGCCCTTT TTrGTATCTC GCAGTACrTA CAAGGATCTT 720
TCTAAATCIT AAAAAAAAAA AAAAAAGAAA GAAAGAAAAG AAAAGAAAAA AAGTCAGCCG 780
GGCGTGGTGG CTCATGCCTG TAATCCCAGC ACTTIGGGAC CAAGGTGGAC AGATCACGAG 840
CJICAGGAGAT GGAGACCATC CCGGCCAACA TGGAGAAACC CTGTCTCTAC TAAAAAAAAA 900
AAAAACTCGA GGGGGGCCCG CTACCCAATN CGCCGGCTAG TGG^GTAAA ACAATCAAA 959




(2) INFORMATION FOR SEQ ID NO: 20: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1446 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CGGGGCAGGG CTGTGTGGCA CCGCCAGGGA GCGGGCCCAC CTGAGTCACT TTATTGGGTT 60 

CAGTCAACAC TTTCTTGCTC CCTGmTCT CTTCTGTGGG ATGATCTCAG ATGCAGGGGC 120 

TGGTTTTGGG GTTTTCCTGC TTGTGCCAAG GGCTGGACAC TGCTGGGGGG CTGGAAAGCC 180 

CCTCCCTTCC TGTCCTTCTG TGGCCTCCAT CCCCTCATGG GTGCTGCCAT CCTTCCTGGA 240

GAGAGGGAGG TGAAAGCTGG TGTGAGCCCA GTGGGTTCCC GCCCACTCAC CCAGGAGCTG 300 

GCTGGGCCAG GACCGGGAGA GGGAGCACTG CTGCCCTCCT GGCCCTGCTC CTTCCGCAGT 360 

TAGGGGTGGA CCGAGCCTCG CTTTCCCCAC TGTTCTQGAG GGAAGGGGAA GGAGGGGGTC 420 

TTCAGGCTGG AGCCAGGCTG GGGGTGCTGG GTGGAGAGAT GAGATTTAGG GGGTGCCTCA 480 

TGGGGTGGGC AGGCCTGGGG TGAAATRAGA AAGGCCCAGA ACGTGCAGGT CTGCGGAGGG 540

GAAGTGTCCT GAGTGAAGGA GGGGACCCCC ATCCTGGGGG ATGCTGGGAG TGAGTGAGTG 600 

AGATGGCTGA GTGAGGGTTA TGGGGAGCCT GAGGTTTTAT GGGCCTGTGT ATCCCCTTCT 660 

CCCGGCCCCA GCCTGCCTCC CTCCTGCCCG CCTGGCCCAC AGGTCTCCCT CTGGTCCCTG 720 

TCCCTCTGGT GGTTGGGGAT GGAGCGGCAG CAAGGGGTGT AATGGGGCTG GGTTCTGTCT 780 

TCTACAGGCC ACCCCGAGGT CCTCAGTGGT TGCCTGGGGA GCCGGACGGG GCTCCTGAGG 840

GGTACAGGTT GGGTGGGCCC TCCCTGAGGG TCTGGGGTCA GGCTTTGGCT CTGCTGCCTC 900 

TCAGTCACCA AGTCACCTCC CTCTGAAAAT CCAGTCCCTT CTTTGGATGT CCTTGTGAGT 960 

CACTCTGGGC CTGGCTGTCG TCCCTCCTCA GCTTCTTGTT CCTGGGACAA GGGTCAAGCC 1020 

AGGATGGGCC CAGGCCTGGG ATCCCCCACC CCAGGACCCC CAGGCCCCCT CCCCTGCTGC 1080 

TTTGCGGGGG GCAGGGCAGA AATGGACTCC TTTTGGGTCC CCGAGGTGGG GTCCCCTCCC 1140

AGCCCTGCAT CCTCCGTGCC STAGACCTGC TCCCCAGAGG AGGGGCCTTG ACCCACAGGA 1200 

CGTGTTCGTGG CGCCTGGCAC TCAGGGACCC CCAGCTGCCC CAGCCCTGGT CTCTGGCGCA 1260 

TCTCTTCCCT CTTGTCCCGA AGATCTGCGC CTCTAGTGCC TTTTGAGGGG TTCCCATCAT 1320 

CCCTCCCTGA TATTCTATTG AAAATATTAT GCACACTGTT CATGCTTCTA CTAATCAATA 1380 

AACGCTTTAT TTAAAGCCAA AAAAAAAAAA AAAAAACTCG AGGGGGGGCC CGTACCCAAT 1440

TCGCCA 1446 

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(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1471 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CAAAAAATAA TAATGATAAT TTAAAATAAA TAAGTAACTA ATAAAAAGAT TTTATATCCC 60 

AGTCTTATGA TGTTCGTrGG CAAGGCTAGA TAAAAAGATG TTAGAATGAA AGAACATATT 120 

TTTAGTGATA TGTAAATCAA GGATTCTACA ATAGTCATAT ATTTTTATAT GAATGAATCT 180 

TGGGTTGGGC TGGAGAGGTA TGTGTGTGTA AATATAAAGG TCTCACATTC AGAGTATAGC 240 

TCTGAAATAA TGGAACTCAT GTCTACAATT CAACATGCAT CTGTATAGTT ACATCTCATG 300

TAAATATACA CAGACATATT TTCCAGCCAG TAATTGACAG TTAATGTCCA AAACAGGTGA 360 

TTCATAGGTA ACAGAAATTA GATAACCACC AATTTTGCCC AAGAGAAAGA CTAGAAGGAC 420 

TAAAAGCAGT TGAATGTATG GTACTGACAT TGTCATAAGC AGTCTGATAA CCAGTTTATT 480 

GAAACGTOTG CATTAACAGA GAATTTAATT TTAAACCCAT AATTTCTCCT ATCCATTAAA 540 

ATATTATAAT TGTTAGTAGT ATGAAACCAA CAGGAAATGT TTTTTAATCA TTTAGTGAGG 600

TGATTCATTT GTrTCAIGGG CAAACACTAT CCAGGAAAAG CCTTGCTTGC CTGTTTCCCA 660 

AAGAGCTCTA AGAAATAGAA TCAAGTCTAA AATGGTTCAG ACCATTCAGG ATTTCTTGTC 720 

ACTCTTCTCA ACCCCGATCT TCCIGTTATT ACTGATGTTT GAAACCCTGT CATTAGCCCC 780 

GGCCTCGTTA AAGCCCCTCA GAGTCACCTC TCATTCATAG CAATAGAATT CAACCCCAAG 840 

TGGTTCATGG TGTCCCCAGC ACAGCCGAGA GACCTGATCT CrCGAITCAG TGCTTTTAGC 900

TCTTCGAGTT TACCCTAAGA TACCTTCGGG CAATATTm AACCAACCCA AAAGCTCTTC 960 

AGGTCATTTC TGAAGAGGAC AAGGTX3AATC TTGGCTTGGA ACACCATTTT TGGGCTCTTG 1020 

CTACTGAATG AATCAGAAAG GAATnTTTC TGAAGAGCAT TAGAAAGTAA AGGAGATGTT 1080 

AAAATAAGTT CTTGAAGTAT GTTTTATATT TATCTAAAAC ACTGATTTTA AAAGTITACA 1140 

rrCAAATCTG TATTCAAAAG AAGTACTGAT TrGTAATTAT TATAGTTTGT GTGTATCATC 1200

CCCTTTTAAC CGTGCCTAAC AACTGTACTT AAATrTTGTT rrCCTAGTGT AACAAATGTT 1260 

TCCCATAAGA TTTTCTAGAG CCAAATAATG GGAGTGAAAA ATTCCTTAAG TGTTATATAA 1320 

GAAAATATAT TAGAAAATCA GdTTGGATT ATACGATTTC TAAAATATAC TAATACAGAA 1380 

TCCTCAGTAA TATGnTTGA ATrGGATTTT TTCTCAGAAC TGTTACATAA TAAATAATAC 1440 

ATCAACCAGA AAAAAAAAAA AAAAAAATTN C 1471
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(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1402 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
AGGGACGTCT TGCCTGAGGA GATGCCCATT TCTGTCCTGG RTTACCCTCA CTGCGTGGTG 
CATGAGCTGC CAGAGCTGAC GGCGGAGAGT TTGGAAGCAG GTGACAGTAA CCAATTTTGC 
TGGAGGAACC TCTTTTCTTG TATCAATCTG CTTCGGATCT TGAACAAGCT GACAAAGTGG 
AAGCATTCAA GGACAATGAT GCTGGTGGTG TTCAAGTCAG CCCCCATCTT GAAGCGGGCC 
CTAAAGGTGA AACAAGCCAT GATGCAGCTC TATGTGCTGA AGCTGCTCAA GGTACAGACC 
AAATACTTGG GGCGGCAGTG GCGAAAGAGC AACATGAAGA CCATGTCTGC CATCTACCAG 
AAGGTGCGGC ATCGGCTGAA CGACGACTGG GCATACGGCA ATGATCTTGA TGCCCGGCCT 
TGGGACTTCC AGGCAGAGGA GTGTGCCCTT CGTGCCAACA TTGAACGCTT CAACGCCCGG 
CGCTATGACC GGGCCCACAG CAACCCTGAC ITCCTGCCAG TGGACAACTG CCTGCAGAGT 
GTCCTGGGCC AACGGGTGGA CCTCCCTGAG GACTTTCAGA TGAACTATGA CCTCTGGTTA 
GAAAGGGAGG TCTTCTCCAA GCCCATTTCC TGGGAAGAGC TGCTGCAGTG AGGCTGTTGG 
TTAGGGGACT GAAATGGAGA GAAAAGATGA TCTGAAGGTA CCTGTGGGAC TGTCCTAGTT 
CATTGCTGCA GTGCTCCCAT CCCCCACCAG GTGGCAGCAC AGCCCCACTG TGTCTTCCGC 
AGTCTGTCCT GGGCTTGGGT GAGCCCAGCT TGACCTCCCC TTGGTTCCCA GGGTCCTGCT 
CCGAAGCAGT CATCTCTGCC TGAGATCCAT TCTTCCTTTA OTTCCCCCAM CCTCCTCTCT 
TGGATATGGT TGGnTTGGC TCATTTCACA ATCAGCCCAA GGYTGGGAAA GCTGGAATGG 
GATGGGAACC CCTCCGCCGT GCATCTRAAT TTCAGGGGTC ATGCTGATGC CTCTCGAGAC 
ATACAAATCC TTGCCTTTGT CAGCTTGCAA AGGAGGAGAG ITTAGGATTA GGGCCAGGGC 
CAGAAAGTCG GTATCTTGGT TGTGCTCTGG GGTGGGGGTG GGGTGTTTCT GATGTTATTC 
CAGCCTCCTG CTACATTATA TCCAGAAGTA ATTGCGGAGG CTCCTTCAGC TGCCTCAGCA 
CTTTGATTTT GGACAGGGAC AAGGTAGGAA GAGAAGCTTC CCTTAACCAG AGGGGCCATT 
TTTCCTTTTG GCTTTCGAGG GCCTGTAAAT ATCTATATAT AATTCTGTGT GTATTCTGTG 
TCATGTTGGG GITTTTAATG TGATTGTGTA TrCTGTTTAC ATTAAAAAGA AGCAAAAATA 
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ATAAAAAAAA AAAAAAAAAA CT 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1047 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



1402 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GGCACAGGGG ACTACAGGCA CCCACGACCA TACCCAGCTA ATmTGTAT rrmTGTAG 60
AGATCGGGTT TCACGATCTC GCCCAGGCTC GTCTTGAACT CCTGGGCTTG AGCGATCTTC 120
ccATcnrcc atcitcgcct cctaaagtgc tcggactcca GGCATGAGCC ACCATCCCCA 180
GCCAAGATTC TTATrGATrA CCATGTTCCT TCAAGAAGCC AAGCCAGTTT CCAATATTCC 240
CCATTTGCTC GAGTCTTCGT ACTTTGGGTA GAAGCAACTG GTAAATTCrr AATTGGAACA 300
mTGCTGcyro TAGATAACCA CGTATGGCCA AACCTAGAGC ATCTAGGCTC ACAATTACTA 360
TCCTGACTTG ATAACAAGTG TTCTGATATT AACCTGAAAA TGGGAATAAT GCCAAATCTG 420
TOrAACTTAA CATCTATATA CACAGTGGGG AGAACTGAAG TTATTAAACC TGGAATCTCT 480
CTTCATCAAGG CTAACAGTAG TTATCTAAGA AGCAAAGGAC CTACAATTCT TAGACITCGA 540
GTCATATTCT TTAAGGACGT GTTCTCAAAC TATATCAAGC ATCTGGTTTC CACGTATTTC 600
TCCCTCAGAA ATTATGAAGT ACAAGTAAAA ATGAAGGTAC AGGGTAAGAC ACATGCTCCT 660
TTCTIGCrcT TGAffTCGAGA CACmTCCA GCCATCTTAA CCCCTTWACA CAAAACAATT 720
TCTOTTTAT AGCAAATAAG TXSACTCAACA TAATTTCAAT ATGATGTTTA TCCACCAGTA 780
crrrccrrrc agcttctagt cccataartg gtttgtcaag tcatcggtta cattagccaa 840
GATAGGCCTA GACITCAAGT CTAGAATCTT TITCCCACTA TATCCCAAAG TAGAATGTCG 900
CTATCTCAGG CrrCATmrc TXTSTTCAATT TCCCACCTGT ACAGTTGTrA TCATTCACrr 960
TCCTTArorG TCTAATAAAT CITCTTCCAT GAAATGATCA AAAAAAAAAA AAAAAAAACT 1020
CGAGGGGGGG CCCGGTACCC AAATCGC 1047



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
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(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
TTGGAAAGGG TCTAGCTCTT TCTCATTCAC CAACTATATT AGAAGCACTT GAGGGAAATT 
TACCACTCCA AATCCAAAGC AATGAACAGT CrTTTCTGGA TGATTTTATT GCCTGTGTCC 
CAGGATCAAG TGGTGGAAGG CTTCCAAGGT GGCITCAGCC AGATTCATAT GCGGATCCTC 
AGAAAACATC TTTGATCCTG GAATAAGGAT GATATTCGTT GTGGTTGGCC TACCACCATA 
ACTGTTCAAA CAAAAGACCA GTATGGGGAT GTGGTACATG TTCCCAATAT GAAGGTAATT 
ATAACTGGAT TAAATTAGCA GACATCTATA TACTGGCTGC AATGACTGAT AAAATTTTAG 
AAATGCCAAG TGCTGAGRGT CCATTTGTTC TACCCTCTTT ATATAAAGGG TGATGCTGAA 
AGTTTGTTTA AATGACTTGT TTATATTAAT TAGTCCCCAA GTGTCCAAGT TACACCTGTT 
TTTTTTGTGA GTTTGTTCTT TACATTTTGC TACCTGITAC GGGGACTCAA AGGAGGGATA 
AGAAAGTATC CATCTAAAGA GTGCTAGACA CATACAGTGA AGCCCCTCAA TATGTATTGA 
TTGAATAAAT GCATGAAAGA ATACATTTTT AAATTTTGTG TATAGTTTTG AAAGACTCAA 
GTACGTTCTG TGTTTGGTAT TACTGAAACC ACATTTTAAA AATAACACTC ATTAAGTTAG 
AAATATATGA GTTTAGATTG TAAAAGAATG AGGAATTGAA ATAGTTGTAT ACCATATTGA 
TGAATATAGA GTTTTTAGGA TACCTCTTAC CTGAAATATT AATAATAATG TTTNCAGAGC 
ATATTATACA TAATTATTTG TGATTTAATC TGTTAATATG AATATCTCAT TTAAAACTTT 
TATTTCTGAA AAAATTATAT TGAATAAAAT TTTATATAGG CAGTCCCCAG CCCTTTCCTC 
CTTCAAAGTT GTCTTATAGA GTGATTGGTT 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1208 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TAATCGCTAC TATAGGGAAA GCTGGTCGCT GCAGGTACCG GTCCGGAATT CCGGGTCGAC 
CCACGCGTCC GAGCGAAATG GCGCCTCCGG CCCCCGGCCC GGCCTCCGGC GGCTCCGGGG 
AGGTAGACGA GCTGTTCGAC GTAAAGAACG CCTTCTACAT CGGCAGCTAC CAGCAGTGCA 
TAAACGAGGC GCASGGGTGA AGCTRTCAAG CCCAGAGAGA GACGTGGAGA GGGACGTCTT 
CCTGTATAGA GCGTACCTGG CGCAGAGGAA GTTCGGTGTG GTCCTGGATG AGATCAAGCC 
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CTCCTCGGCC CCTGAGCTCC AGGCCGTGCG CATGTTTGCT GACTACCTCG CCCACGAGAG 360 

TCGGAGGGAC AGCATCGTGG CCGAGCTGGA CCGAGAGATG AGCAGGAGCK TGGACGTGAC 420 

CAACACCACC TTCCTGCTCA TGGCCGCCTC CATCTATCTC CACGACCAGA ACCCGGATGC 480

CGCCCTGCGT GCGCTGCACC AGGGGGACAG CCTGGAGTGC ACAGCCATGA CAGTGCAGAT 540 

CCTGCTGAAG CTCGACCGCC TGGACCTCGC CCGGAAGGAG CTGAAGAGAA TGCAGGACCT 600

GGACGAGGAT GCCACCCTCA CCCAGCTCGC CACTGCCTGG GTCAGCCTGG CCACGGGTGG 660 



TGAGAAGCTG CAGGATGCCT ACTACATCTT CCAGGAGATG GCTGACAAGT GCTCGCCCAC 720
CCTGCTGCTG CTCAATGGGC AGGCGGCCTG CCACATGGCC CAGGGCCGCT GGGAGGCCGC 780
TGAGGGCCTG CTX3CAGGAGG CGCTAGACAA GGATAGTCGC TACCCRGAGA CGCTGGTCAA 840
CCTCATCGTC CTOTCCCAGC ACCTKGGCAA GCCCCCTGAG GTGACAAACC GATACCTGTC 900
CCAGCTGAAG GATGCCCACA GGTCCCATCC CTTCATCAAG GAGTACCAGG CCAAGGAGAA 960
CGACTTTGAC AGGCTGGTGC TACAGTACGC TCCCAGCGCT GAGGCTGGCC CAGAGCTGTC 1020
AGGACCATGA AGCCAGGACA GAGGCCAGGA GCCAGCCCTG CAGCCCTCCC CACCCGGCAT 1080
CCACCTCCAT CCCTCTGGGG CAGGAGCCCA CCCCCAGCAC CCCCATCTGT TAATAAATAT 1140
CTCAACTCCA RGGTCTTCCA CCTGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1200
AAAAAAAA 1208

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1922 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTCCTGCGCT ACTGAGCAGC GCCATGGAGG ACTCTGAAGC ACTGGGCTTC GAACACATGG 60
GCCTCGATCC CCGGCTCCTT CAGGCTGTCA CCGATCTGGG CTGGTCGCGA CCTACGCTGA 120
TCCAGGAGAA GGCCATCCCA CIGGCCCTAG AAGQGAAGGA CCTCCTGGCT CGGGCCCGCA 180
CGGGCTCCGG GAAGACGGCC GCTTATGCTA TTCCGATGCT GCAGCTCTTG CTCCATAGGA 240
AGGCGACAGG TCCGGTGGTA GAACAGGCAG TCAGAGGCCT TGTTCTTCTT CCTACCAAGG 300
AGCTGGCACG GCAAGCACAG TCCATGATTC AGCAGCTGGC TACCTACTGT GCTCGGGATG 360
TCCGAGTCGC CAATOTCTCA GCIGCTCAAG ACTCAGTCTC TCAGAGAGCT GTCCTGATGG 420

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10 



AGAAGCCAGA TGTGGTAGTA GGGACCCCAT CTCGCATATT AAGCCACTTG CAGCAAGACA 480 

GCCTGAAACT TCGTGACTCC CTGGAGCTTT TGGTGGTGGA CGAAGCTGAC CTTCTTTTTT 540 

CCTTTGGCTT TGAAGAAGAG CTCAAGAGTC TCCTCTGTCA CTTGCCCCGG ATTTACCAGG 600 

CTTTTCTCAT GTCAGCTACT TTTAACGAGG ACGTACAAGC ACTCAAGGAG CTGATATTAC 660 

ATAACCCGGT TACCCTTAAG TTACAGGAGT CCCAGCTGCC TGGGCCAGAC CAGTTACAGC 720

AGTTTCAGGT GGTCTGTGAG ACTGAGGAAG ACAAATTCCT CCTGCTGTAT GCCCTGCTCA 780 

AGCTGTCATT GATTCGGGGC AAGTCTCTGC TCTTTGTCAA CACTCTAGAA CGGAGTTACC 840 

GGCTACGCCT GTTCTTGGAA CAGTTCAGCA TCCCCACCTG TGTGCTCAAT GGAGAGCTTC 900

CACTGCGCTC CAGGTGCCAC ATCATCTCAC AGTTCAACCA AGGCTTCTAC GACTGTGTCA 960 

TAGCAACTGA TGCTGAAGTC CTGGGGGCCC CAGTCAAGGG CAAGCGTCGG GGCCGAGGGC 1020 

CNAAAGGGGA CAAGGCCTCT GATCCGGAAG CAGGTGTGGC CCGGGGCATA GACTTCCACC 1080 

ATGTGTCTGC TGTGCTCAAC TTTGATCTTC CCCCAACCCC TGAGGCCTAC ATCCATCGAG 1140 

CTGGCAGGAC AGCACGCGCT AACAACCCAG GCATAGTCTT AACCTTTGTG CTTCCCACGG 1200

AGCAGTTCCA CTTAGGCAAG ATTGAGGAGC TTCTCAGTGG AGAGAACAGG GGCCCCATTC 1260 

TGCTCCCCTA CCAGTTCCGG ATGGAGGAGA TCGAGGGCTT CCGCTATCGC TGCAGGGATG 1320 

CCATGCGCTC AGTGACTAAG CAGGCCATTC GGGAGGCAAG ATTGAAGGAG ATCAAGGAAG 1380 

AGCTTCTGCA TTCTGAGAAG CTTAAGACAT ACTTTGAAGA CAACCCTAGG GACCTCCAGC 1440 

TGCTGCGGCA TGACCTACCT TTGCACCCCG CAGTGGTGAA GCCCCACCTG GGCCATGTTC 1500

CTGACTACCT GGTTCCTCCT GCTCTCCGTG GCCTGGTRCG CCCTCACAAG AAGCGGAAGA 1560 

AGCTGTCTTC CTCTTGTAGG AAGGCCAAGA GAGCAAAGTC CCAGAACCCA CTGCGCAGCT 1620 

TCAAGCACAA AGGAAAGAAA TTCAGACCCA CAGCCAAGCC CTCCTGAGGT TGTTGGGCCT 1680 

CTCTGGAGCT GAGCACATTG TGGAGCACAG GCTTACACCC TTCGTGGACA GGCGAGGCTC 1740 

TGGTGCTTAC TGCACAGCCT GAACAGACAG TTCTGGGGCC GGCAGTGCTG GGCCCTTTAG 1800

CTCCTTGGCA CTTCCAAGCT GGCATCTTGC CXCTTGACAA CAGAATAAAA ATTTTAGCTG 1860 

CCCCAAAAAA AAAAAAAAAA AAAAAAACTC GAGGGGGGGC CCGTACCCAA TTCGCCCTAT 1920 

AA 1922 






(2) INFORMATION FOR SEQ ID NO: 27: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1951 base pairs 
(B) TYPE: nucleic acid
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TCCTCCCCAG AGCGC3GCTGA GCCCCAC3GCG SAC3GGTGGCG GGGGAGCCTC GGGGAGCCGC 60
CGCCACCTCC ACGGGCCTCT CTGAGCTCGG ACACCAGCGC CCTGTCCTAT GACTCTGTCA 120 

ACTACACGCT GCTTOGTAGAT GAGCATCCAC AGCTGGAGCT GGTGAGCCTG CGCCGTGCTT 180
CGGAGACTAC AOTCACGAGA GTCACTCTCC CACCGTCTAT GACAACTGTG CCTCCGTCTC 240 
CTCGCCCTAT GAGTCGGCCA TCGGAGAGGA ATATGAGGAG GCCCCGCGGC CCCAGCCCCC 300 
TCCCTCCCTC TCCGAGGAAC TCCACGCCTC ATGAACCCGA CGTCCATTTC TCCAAGAAAT 360 
TCCTCAACGT YTTCATGAGT GGCCGCTCCC GCTCCTCCAG TGCTGAGTCC TTCGGGCTGT 420 

TCTCCTGCAT CATCAACGGG GAGGAGCAGG AGCAGACCCA CCGGGCCATA TTCAGGTrTG 480
TGCCTCGACA CGAAGACGAA CITCAGCTGG AAGTGGATGA CCCTCTGCTA GTGGAGCTCC 540 
AGGCTCAAGA CTACroGTAC GAGGCCTACA ACATCCGCAC TGGTGCCCGG GGTGTCTTTC 600
CTCCCTATTA CGCCATCGAG GTCACCAAGG AGCCCGAGCA CATGGCAGCC CTGGCCAAAA 660
ACAC?TCACTC GC?rGGACCAG TICCGGGTGA AGTTCCTGGG CTCAGTCCAG GTTCCCTATC 720
ACAAGGGCAA TGACGTCCTC TCTGCTGCTA TGCAAAAGAT TGCCACCACC CGCCGGCTCA 780
CCGTGCACTT TAACCCGCCC TCCAGCTGTG TCCTGGAGAT CAGCGTGCGG GGTGTGAAGA 840
TAGGCGTCAA GGCCGATCAC TCCCAGGAGG CCAAGGGGAA TAAATGTAGC CACTTTTTCC 900
AOTTAAAAAA CATCTCTTTC TGCGGATATC ATCCAAAGAA CAACAAGTAC TTTGGGTICA 960
TCACCAAGCA CCCCGCCGAC CACCGGTTTG CCTGCCACGT CTTTGTGTCT GAAGACTCCA 1020
CCAAAGCCCT GGCAGACTCC GTGGGGAGAG CATTCCAGCA GTTCTACAAG CAGTTTGTGG 1080
AGTACACCTC CCCCACAGAA GATATCTACC TCGAGTAGCT GTGCAGCCCC GCCCTCTGCG 1140
TCCCCCAGCC CTCAGGCCAG K3CCAGGACA GCTGGCTGCT GACAGGATGT GGCACTGCTT 1200
GAGGAGGGGC ACCTCCCACC GCCAGAGGAC AAGGAAGTGG GGCGCTGGCC CAGGGTAGGG 1260
GAGGGTCGGG CAATCGGGAG AGGCAAATGC AGTTTATrGT AATATATGGG ATTAGATTCA 1320
TCTATCGAGG GCAGAGTCGG CTOCCTGGGG ATTGGGAGGG ACAGGGCTTG GGGAGCAGGT 1380
CTCTCGCAGA GAAGGATCTC CGTTCCAGGA GCACACGGCC CTGCCCCATC CTGGGCCTTA 1440
CCTCCCCTCC CAGGGCTCGG GCGCTGTGGC TCCTGCCTTG ATGAAGCCCG TGTCCTGCCT 1500
TGATGAAGCC TtTTGCCACCT GCAAGTGCCC GCCCTGCCCC TGCCCCAACC CCCACCGAAG 1560
AGCCCTCAGC TCAGGCTCAG CCCAGCCACC TCCCAAGGAC TTTCCAGTGA GGAAATGGCA 1620
ACACC?reGAG GTCAAGTCCC TCTTCTCAGC TCCGTCATCT GCGGGGCTTC TGGGTGGCTC 1680
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CTGCCACTGA CCTCACCGGC ATGCTGGCCT GTGGCAGGCC TAGGACCTCA GGCGGGGAGG 1740 

AGGAGCTGCC GCAAGGCCCT GTCCCAGCAG AAGAGGGAGG CTTCCTGACT GACACAGGCC 1800 

AGCCCCATCT TGGTCCTGTC ACCCTGGCCC CAACTATTAA AGTGCCATTT CCTGTCAAAA 1860 

AAAAAAAAAA AAAATCGGGG GGGGCCCC5GA ANCCAATTTC CCCCAAAAAG GGGGGTTATA 1920 

AAAATTCCCN GGCNGTGTTT TTAAAAATTC G 1951

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3989 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

GGCACAGGCC GCAGGGNACC TATGGGCGCA TATAGGTTGT AATGAAACTG TAGTCTCAGT 60

TGGAAGCCTA GACATGAAAT GGGTCAGTGA GCAAGGCTCT ATTCCTAGTC TCCAGCCATG 120 

CCTGTGGAAC CTCARCCCRC TCTCAGCACA TTGGACCCAG GCAGATGYAA AAAATTCACA 180 

GAACTATGAT TTGGACTCAA GGGTTTGTAG ATTTCCTCCT TCATTCTAAT TTCAGTGTCT 240 

AAAATTCTTG CATCCRTGAA CGAGCTGGGC ATTTGATGAG ACAGGGCYGA ATACTGCAGT 300 

TTTCCTCCTA GAAATCATCT GGGGCATTTT CTTTGAACTG ATGGGAACAA TAAGGCATAA 360

CTGTTTGCAC AAACTTGGGA TAARTGATTT TGGGATAACG ATCTACCAGA ATGGGGATAT 420 

TTCACCCTTG GTTCTGAGAT GCAAACCAAA GAATATCATG ACCAGCTTTC AGGCCTCCTG 480 

AAGTATATCT CTCACATTGT CCTGTTCTCA TGCTGAGGAG CCTGAGATCC CTGTGTGGGG 540 

ATTAGACAGT GGACTGTTAT GGGTGTAGGT GAATTGGCTT ATTTTGTCTG TCCCTGTCTG 600 

AATGTATTGC AGGAAYTAAA AAGGACCAAG AAGAGGAAGA AGACCAAGGC CCACCATGCC 660

CCAGGCTCAG CAGGGAGCTG CTGGAGGTAG TAGAGCCTGA AGTCTTGCAG GACTCACTGG 720 

ATAGATGTTA TTCAACTCCT TCCAGTTGTC TTGAACAGCC TGACTCCTGC CAGCCCTATG 780 

GAAGTTCCTT TTATGCATTG GAGGAAAAAC ATGTTGGCTT TTCTCTTGAC GTGGGAGAAA 840 

TTGAAAAGAA GGGGAAGGGG AAGAAAAGAA GGGGAAGTU^G ATCAAAGAAG GAAAGAAGAA 900 

GGGGAAGAAA AGAAGGGGAA GAAGATCAAA ACCCACCATG CCCCAGGCTC AGCAGGGAGC 960

TGCTGGATCA GAAAGRGCCT GAAGTCTTGC AGGACTCACT GGATAGATGT TATTCAACTC 1020 

CTTCAGTTGT GTTGAACTGT GTGACTCATG CCAGCCCTAC AGAAGTGCCT TTTATGTATT 1080 

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GGAGCAACAG CATOTTGGCT TGGCTGTTGA CATGGATGAA ATTGAAAAGT ACCAAGAAGT 1140 

GGAAGAAGAC CAAGACCCAT CATGCCCCAG GCTCAGCAGG GAGCTGCTGG ATGAGAAAGA 1200 

GCCTGAAGTC TTCCAGGACT CACTCGATAG ATGTTATTCG ACTCCTTCAG GTTATCTTGA 1260

ACTGCCTCAC TTAGGCCAGC CCTACAGCAG TGCKGTTTAC TCATTGGAGG AMCAKTACCT 1320 
TGGCrrKKCT CTTGACGTGG ASAAATTGAA AAGAAGGGGA AGGGGAARAA AAGAAGGGGA 1380

AGAAGATCAA AGAAGGAAAG AAGAAGGGGA AGAAAAGAAG GGGAAGAAGA TCAAAACCCA 1440 

CCATGCCCCA GGCTCAGCAG GGAGCTGCTG GATGAGAAAG GGCCTGAAGT CTTGCAGGAC 1500 

TCACTCGATA GATGTTATTC AACTCCTTCA GGTTGTCTTG AACTGACTGA CTCATGCCAG 1560

CCCTACAGAA GTCCdTTTA YRTATTGGAG CAACAGYGTG TTGGCTTGGC TGTTGACATG 1620 

GATGAAATTG AAAAGTACCA AGAAGTGGAA GAAGACCAAG ACCCATCATC CCCCAGGCTC 1680 

AGCAGGGAGC TCCTGGATGA GAAAGAGCCT GAAGTCTTGC AGGACTCACT GGATAGATGT 1740 

TATTCGACTC CTTCAGGTTA TCTTCAACTG CCTGACTTAG GCCAGCCCTA CAGCAGTGCT 1800 

GTTTACTCAT 1GGAGGAACA GTACCTTGGC TK3GCTCTTG ACGTGGACAG AATTAAAAAG 1860

GACCAAGAAG AGGAAGAAGA CCAAGGCCCA CCATGCCCCA GGCTCAGCAG GGAGCTGCTG 1920 

GAGGTAGTAG AGCCTGAAGT CTTGCAGGAC TCACTGGATA GATGTTATTC AACTCCTTCC 1980 

AGTTGTCTTG AACAGCCTGA CTCCTGCCAG CCCTATGGAA GTTCCTTTTA TGCATTGGAG 2040 

GAAAAACATG rTCGCTTTTC TCTTGACGTG GGAGAAATTG AAAAGAAGGG GAAGGGGAAG 2100 

AAAAGAAGGG GAAGAAGATC AAMGAAGRAA AGAAGAAGGG GAAGAAAAGA AGGGGAAGAA 2160

GATCAAAACC CACCATGCCC CAGGCTCAAC GGCGTGCTGA TGGAAGTGGA AGAGCSTGAA 2220 

GTCTTACAGG ACTCACTGGA TAGATGTTAT TCGACTCCGT CAATGTACIT TGAACTACCT 2280 

GACTCATTCC AGCACTACAG AAGTGTGTTT TACTCATTTG AGGAACAGCA CATCAGCTTC 2340 

GCCCTTTACG TGGACAATAG GTITITTACT TTGACGGTGA CAAGTCTCCA CCTGGTGTTC 2400 

CAGATCGGAG TCATATTCCC ACAATAAGCA GCCCTTASTA AKCCGAGAGA TGTCATTCCT 2460

GCAGGCAGGA CCIATAGGCA MGTGAAGATT TGAATGAAAG TACAGTTCCA TTTQGAAGCC 2520 

CAGACATAGG ATGGOTCAGT GGGCATGGCT CTATTCCTAT TCTCAAACCA TGCCAGTGGC 2580 

AACCIGTGCT CAGTCTGAAG ACAA'TCGACC CACGTTAGGT GTGACACGTT CACATAACTG 2640 

TCCAGCACAT GCCGGGAGTC ATCAGTCRGA CATTTTAATr TGAACCACGT ATCTCTGGGT 2700 

AGCTACAAAA TTCCTCAGGG ATTTCATrrr GCAGGCATGT CTCTGAGCTT CTATACCTGC 2760

TCAAGGTCAK TC?rCATCTTT GTCTTTAGCr CATCCAAAGG TCTTACCCTG GTrTCAATGA 2820 

ACCTAACCTC ATICrrTCTG TCTTCAGTGT TGGCTTGTTT TAGCTGATCC ATCTGTAACA 2880 
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CAGGAGGGAT CCTTGGCTGA GGATTGTATT TCAGAACCAC CAACTGCTCT TGACAATTGT 2940 

TAACCCGCTA GRCTCCTTTG GTTAGAGAAG CCACAGTCCT TCAGCCTCCA ATTGGTGTCA 3000 

GTACTTAGGA AGACCACAGC TAGATGGACA AACAGCATTG GGAGGCCTTA GCCCTGCTCC 3060 

TCTCRATTCC ATCCTGTAGA GAACAGGAGT CAGGAGCCGC TGGCAGGAGA CAGCATGTCA 3120 
CCCAGGACTC TGCCGGTCCA GAATATGAAC AAYGCCATGT TCTTGCAGAA AACGCTTAGC 3180

CTGAGTTTCA TAGGAGGTAA TCACCAGACA ACTGCAGAAT GTRGARCACT GAGCAGGACA 3240 

GCTGACCTGT CTCCTTCACA TAGTCCATRT CACCACAAAT CACACAACAA AAAGGAGARG 3300 

AGATATTTTG GGTTCAAAAA AAGTAAAAAG ATAATGTAGC TGCATTTCTT TAGTTATTTT 3360 

GARCCCCAAA TATTTCCTCA TCTTTTTGTT GTTGTCATKG ATGGTGGTGA CATGGACTTG 3420 

TTTATAGAGG ACAGGTCAGC TGTCTGGCTC AGTGATCTAC ATTCTGAAGT TGTCTGAAAA 3480 

TGTCTTCATG ATTAAATTCA GCCTAAACGT TTTGCCGGGA ACACTGCAGA GACAATGCTG 3540 

TGAGTTTCCA ACCTYAGCCC ATCTGCGGGC AGAGAAGGTC TAGTTTGTCC ATCASCATTA 3600 

TCATGATATC AGGACTGGTT ACTTGGTTAA GGAGGGGTCT AGGAGATCTG TCCCTTTTAG 3660 

AGACACCTTA CTTATAATGA AGTATTTGGG AGGGTGGTTT TCAAAATTAG AAATGTCCTG 3720 

TATTCCRATG ATCATCCTGT AAACATTTTA TCATTTATTA ATCATCCCTG CCTGTGTCTA 3780 

TTATTATATT CATATCTCTA CGCTGGAAAC TTTCTGCCTC AATGTTTACT GTGCCTTTGT 3840 

TTTTGCTAGT GTGTGTTGTT GAAAAAAAAA ACATTCTCTG CCTGAGTTTT AATTTTTGTC 3900 

CAAAGTTATT TTAATCTATA CAATTAAAAG CTTTTGCCTA TCAAAAAAAA AAAAAAAAAA 3960 

AAAAAAAAAA AAAAAGCGGA CGCGTGGGC 3989 






(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3735 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CTCCTGTTCG CTGGCTGGGC TCCGCAGCAG GCTTGGCCAG CSGCTGACGG GTCGGCGGGC 60
GGGTTICTGT GAACAGGCAC GCAGCTGCAG ATTTTATTCT GGTAGTGCAN CCCTCTCAAA 120
GGTTGAAGGA ACTGATGTAA CAGGGATTGA AGAAGTAGTA ATTCCAAAAA AGAAAACTTG 180
GGATAAAGTA GCCGTTCTTC AGGCACTTGC ATCCACAGTA AACAGGGATA CCACAGCTGT 240
GCCTTA1GTG TTTCAAGATG ATCCTTACCT TATGCCAGCA TCATCTTTGG AATCTCGTTC 300
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ATTTTTACTC GCAAAGAAAT CCGGGGAGAA TGTGGCCAAG TTTATTATTA ATTCATACCC 




CAAATATTTT CAGAAGGACA TAGCTGAACC TCATATACCG TGTTTAATGC CTGAGTACTT 420
TCAACCTCAG ATCAAAGACA TAAGTGAAGC CGCCCTGAAG GAACGAATTG AGCTCAGAAA 480
AGTCAAAGCC TCTGTGGACA TGTTTGATCA GCTTTTGCAA GCAGGAACCA CTGTGTCTCT 540
TGAAACAACA AATAGTCTCT TGGATTTOTT GTGTTACTAT GGTGACCAGG AGCCCTCAAC 600
TGA1TACCAT TTTCAACAAA CTGGACAGTC AGAAGCATTG GAAGAGGAAA ATGATGAGAC 660
ATCTAGGAGG AAAGCTGGTC ATCAGTTTGG AGTTACATGG CGAGCAAAAA ACAACGCTGA 720
GAGAATCTTT TCTCTAATGC CAGAGAAAAA TGAACATTCC TATTGCACAA TGATCCGAGG 780
AATGGTGAAG CACCGAGCTT ATGAGCAGGC ATTAAACTTG TACACTGAGT TACTAAACAA 840
CAGACTCCAT GCTGATGTAT ACACATTTAA TGCATTGATT GAAGCAACAG TATGTGCGAT 900
AAATGAGAAA TTTGAGGAAA AATGGAGTAA AATACTGGAG CTGCTAAGAC ACATGGTTGC 960
ACAGAAGGTG AAACCAAATC TTCAGACTTT TAATACCATT CTGAAATGTC TCCGAAGATT 1020
TCATCTGTTT GCAAGATCGC CAGCCTTACA GGTTTTACGT GAAATGAAAG CCATTGGAAT 1080
AGAACCCTCG CTTGCAACAT ATCACCATAT TATTCGCCTG TTTGATCAAC CTGGAGACCC 1140
TTTAAAGAGA TCATCCTTCA TCATTTATGA TATAATGAAT GAATTAATGG GAAAGAGATT 1200
TTCTCCAAAG GACCCGGATG ATGATAAGTT TTTTCAGTCA GCCATGAGCA TATGCTCATC 1260
TCTCAGAGAT CTAGAACTTG CCTACCAAGT ACATGGCCTT TTAAAAACCG GAGACAACTG 1320
GAAATTCATT GGACCTGATC AACATCGTAA TTTCTATTAT TCCAAGTTCT TCGATTTGAT 1380
TIGTCTAATG GAACAAATTG ATGTTACCTT GAAGTGGTAT GAGGACCTGA TACCTTCAGC 1440
CTACTTTCCC CACTCCCAAA CAATGATACA TCTTCTCCAA GCATTGGATG TGGCCAATCG 1500
GCTAGAAGTG ATTCCTAAAA TTTGGAAAGA TAGTAAAGAA TATGGTCATA CTTTCCGCAG 1560
TGACCTGAGA GAAGAGATCC TGATGCTCAT GGCAAGGGAC AAGCACCCAC CAGAGCTTCA 1620
GGTCGCATTT GCTGACTCTG CTGCTGATAT CAAATCTGCG TATGAAAGCC AACCCATCAG 1680
ACAGACTCCT CAGGATTGGC CAGCCACCTC TCTCAACTGT ATAGCTATCC TLiilUAAG 1740
GGCTGGGAGA ACTCAGGAAG CCTGGAAAAT GTTGGGGCTT TTCAGGAAGC ATAATAAGAT 1800
TCCTAGAAGT GAGTTGCTGA ATGAGCTTAT GGACAGTGCA AAAGTGTCTA ACAGCCCTTC 1860
CCAGGCCATT GAAGTAGTAG AGCTGGCAAG TGCCTTCAGC TTACCTATTT GTGAGGGCCT 1920
CACCCAGAGA GTAATCAGTG ATTTTGCAAT CAACCAGGAA CAAAAGGAAG CCCTAAGTAA 1980
TCTAACTCCA TTGACCAGTG ACAGTGATAC TGACAGCAGC AGTGACAGCG ACAGTGACAC 2040
CAGTCAAGGC AAATGAAAGT GGAGATTCAG GAGCAGCAAT GGTCTCACCA TAGCTGCTGG 2100




AATCACACCT GAGAACTGAG ATATACCAAT ATTTAACATT GTTACAAAGA AGAAAAGATA 2160 

CAGAnTGGT GAATTTGTTA CTGTGAGGTA CAGTCAGTAC ACAGCTGACT TATGTAGAIT 2220 

TAAGCTGCTA ATATGCTACT TAACCATCTA TTAATGCACC ATTAAAGGCT TAGCATTTAA 2280 

GTAGCAACAT TGCGGTTTTC AGACACATGG TGAQGTCCAT GGCTCTTGTC ATCAGGATAA 2340 

GCCTGCACAC CTAGAGTGTC GGTGAGCTGA CCTCACGATG CTGTCCTCGT GCGATTGCCC 2400

TCTCCTGCTG CTGGACTTCT GCXrXTTGTTG GCCTGATGTG CTGCTGTGAT GCTGGTCCTT 2460 

CATCTTAGGT GTTCATGCAG TTCTAACACA GTTGGGGTTG GGTCAATAGT TTCCCAATTT 2520 

CAGGATATTT CGATGTCAGA AATAACGCAT CTTAGGAATG ACTAAACAAG ATAATGGCAG 2580 

TTTAGGCTGC ACAACTGGTA AAATGACTGT AGATAAATGT TGTAATTAGT GTACACGTTT 2640 

GTATTrTTGT TAATATAGCC GCTGCCATAG TTTTCTAACT TGAACAGCCA TGAATGTTTC 2700

ATCTCTCCCT TTTTTTTTTG TCTATAGCTG TTACCTATTT TAGTGGTTGA AATGAGAGCT 2760 

AGTGATGACA GAAGGATGTG GAATGTCTTC TTGACATCAT TGTGTATTGC TGGTAATCAA 2820 

25 

GTTGGTAACG ACTACTTCTA GCAGCTCTTA CXACTATGAC TTAAGTGGTC CTGGAAGGCA 2880 

GTAAGTGGAG GTTTGCAGCA TTCCTGCCTT CATGAGGGCT TCTACCACTG ACCACTTTGC 2940 

30 ACGTACCTGG CTCCCAGATT TACTTAGGTA CCCCACGAGT CX3TCCACATA AGCAGCTTCA 3000 

TCTTTACCTT GCCAGAGTTG ACAATTATGG GATACTCTAG TCTACTTATA CTTGTGTTCC 3060 

CATCTGTCTG CCATCCTCTG AAGGCCAGGA CCCAGTCATA CATCCTTAGA AACCAAAGTA 3120 

35 

TGGTTTrrGT TTTCTCTTGG AATGTCAGGT CTTAAGGCAT TTAATTGAGG GACAAAAAAA 3180 

AAAAAAAGCC GATATAGTAG CTAGCTACTT AAGCATCCAT GGGTATTGCT CCATATCAAA 3240 

40 GCAGATTTGC AGGACAGAAA GAGTAAATTA GCCTTCAGTC TTGGTTTACA GCTTCCAAGG 3300 

AGAGCCTTGG CCACCTGAAA TGTTAACTCG GTCCCTTCCT GTCTCTAGTT CATCAGCACC 3360 

TGCAGATCCC TGACTCTTGT TAGCCTTACT ATTCAATACA GTCCTTAGAT TCACGGTATG 3420 

45 

CCTCTTCCTA TCCAGGCACC TATTCTGAAT CACCATGTTG CTCTGCAGCT AGAGTTGATA 3480 

GGAGAAAATC CATXTGGGTA GATGGCCTAT GAATTTGTAG TAGACTTTCA AAATGAGTGA 3540 

50 TrrCTTAGCT TGGTACTTTT AAGTTTGTGG TACAGATCCT CCAAACCCAT ACTCTGAGCA 3600 

ATTAACTGCC TTCAACATAG AGAAAATTAA GGCCTCACAG GATGAGTCTC CATTCTCTGT 3660 

AAATGCTTAT TTTATCATAG TCTTTAGCCN CTACTATGAG TAAAATGTTC TCTTCNGCCG 3720 

55 

GGTGTGGTGA CTCAC ^^35 

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1667 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TAGTAATTCA TTTAACTCCT CTTACATGAG TAGCGACAAT GAGTCAGATA TCGAAGATGA 60

AGACTTAAAG TTAGAGCTGC GACGACTACG AGATAAACAT CTCAAAGAGA TTCAGGACCT 120

GCAGAGTCGC CAGAAGCATG AAATTGAATC TTTGTATACC AAACTGGGCA AGGTGCCCCC 180

TGCTGTTATT ATTCCCCCAG CTGCTCCCCT TTCAGGGAGA AGACGACGAC CCACTAAAAG 240

CAAAGGCAGC AAATCTAGTC GAAGCAGTTC CTTGGGGAAT AAAAGCCCCC AGCTTTCAGG 300

TAACCTGTCT GCTCAGAGTG CAGCTTCAGT CTTGCACCCC CAGCAGACCC TCCACCCTCC 360

TGGCAACATC CCAGAGTCCG GGCAGAATCA GCTGTTACAG CCCCTTAAGC CATCTCCCTC 420

CAGTGACAAC CTCTATTCAG CCTTCACCAG TGATGGTGCC ATTTCAGTAC CAAGCCTTTC 480

TGCTCCAGGT CAAGGAACCA GCAGCACAAA CACTGTTGGG GCAACAGTGA ACAGCCAAGC 540

CGCCCAAGCT CAGCCTCCTG CCATGACGTC CAGCAGGAAG GGCACATTCA CAGATGACTT 600

GCACAAGTTG GTAGACAATT GGGCCCGAGA TGCCATGAAT CTCTCAGGCA GGAGAGGAAG 660

CAAAGGGCAC ATGAATTATC AGGGCCCTGG AATGGCAAGG AAGTTCTCTG CACCTGGGCA 720

ACTG1X3CATC TCCATGACCT CGAACCTGGG TGGCTCTGCC CCCATCTCTG CAGCATCAGC 780

TACCTCTCTA GGTCACTTCA CCAA3TCTAT GTGCCCCCCA CAGCAGTATG GCTTTCCAGC 840

TACCCCATTT GGCGCTCAAT GGAGTGGGAC GGGTGGCCCA GCACCACAGC CACTTGGCCA 900

GTTCCAACCT GTGGGAACTG CCTCCTTGCA GAATTTCAAC ATCAGCAATT TGCAGAAATC 960

CATCAGCAAC CCCCCAGGCT CCAACCTGCG GACCACTTAG ACCTAGAGAC ATTAACTGAA 1020

TAGATCTGGG GGCAGGAGAT GGAATGCTGA GGGGGTGGGT GGGGGTQGGA AGTAGCCTAT 1080

ATACTAACTA CTAGTGCTGC ATTTAACTGG TTATTTCTTG CCAGAGGGGA ATGTTTTTAA 1140

TACIGCATTG AGCCCTCAGA ATGGAGAGTC TCCCCCGCTC CAGTTATTGG AATGGGAGAG 1200

GAAGGAAAGA ACAGCmTT TGTCAAGGGG CAGCTTCAGA CCATGCTTTC CTGTTTATCT 1260

ATACTCAGTA ATCAGGATGA GGGCTAGGAA AGTCTTGTTC ATAAGGAAGC TGGAGAACTC 1320

AATGTAAAAT CAAACCCATC TGTAATTTCG AGTGGGTGGA GCTCTTGCTT irGGTACATG 1380

CCCTGAATCC CTCACTCCCT CAAGAATCCG AACCACAGGA CAAAAACCAC CTACTGGGCT 1440

CTCTCCTACC CTGCCCICCT CCCnTTTTT TACCCCTCTC TTTTTTATTT TTTCTTTGCT 1500

CTTTAGAACC CAGTGAAAAA TACCAGC3GTA CTGGGGTGCA ACTCTTTCTT ATGATAGGTC 1560

ATTAGTGCTT TAAGCAAAAG ATATTAGCAG CTTTGACTGC AGCATTAGCA ATTAGGRAAA 1620

AAAAAAANWA AAAACTCGAG GGGGGGCCCG GTTACCCAAT TCGCCCT 1667



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1408 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

ATTACACACC TGAGCACTGT GCCTGGCAAG ACCTGTCTTA ATAGATTAGA GAACCACTGA 60

TAGATGGTCA GCTTTCTGTA GCAGTGAGAA CCCTACATTT CAAATGTGGA TAGCACCTTT 120

GCGGGGAAAC ATCACTTGGC ACATCTGCAT TCTTTTTTGA CACAGGGTCT CACTCTGTTG 180

CCCAGGCTAG AGTGCATGGC ACGATCTTAG CTCACTGCAA CCTCCACCTC CCAAGTTCAA 240

GCGATTCTTC TGCCTCAGCC TCCTGAGCAG CTGGGATCAC AGACATGCGC TACCATGCCC 300

AGCTAATTTT TTGTATTTTT TGTKTGTTTG TTTTTGTTTK TAAGTAGAGA CGGGCTTTCA 360

CCACGTTGGS CAGGCAGGTC TCGAACTCCT GAMCTCAGGT GATCCACCCA CATCTGCGTT 420

CCAATATCTT TCTCAACATA ATGATAGCCG TAATTAATAT TTTCCAGTAC ATTTTTATGC 480

CITTACACAC GAGAGTGGTA GACAGACACA AACCCAGATC TGTCTGACTC CAAAGCCCGT 540

TTGTCATCAT TCCTITTACG GTATCCTATA GTGGTATCCT TTACAGAAAG ACAGCTTTTA 600

CCCAACAAAG ACTTAACTTC CCAGGATGCC AGAAGGACAA AGCGGGATTG CTTTTAAGRA 660

GRAAGTTATC AAGAMCTTAT TTTATAAATG AGATTAGATA GGGAAAGGCA ATTTATCTTT 720

ATTAAAAACT GAAAAGGCCA GCATAGGGAA GGAGGTCCTT CGGTGGTCTT TTTCAGGGAA 780

ATACTTCAGT TGCTTTTATT AGAAACAGAT AGTACCTAAG GTTTTGAGGT AGGWACAGCT 840

TAAGGCATGC TAATGKTCAT GGGTCCTTCC ATAGTCATTT TKGTATTTTG GTTVIACATTT 900

GAGCAATAGG CAGCCCTTCA CTGCTGCTGG AYTCATTCCT GCCAYTATTA CAGGTGACAG 960

AGGAGACAGG AGGTATGTCT TTTCTATTTT TAWACATGCT TTATATTTAA CACAAGCTCT 1020

TGGGTATCTT AGATAAACAG AAGTTGCCTA GCACTCCTTT TAGTGCATTG AACCCTTTAA 1080

CATTTAAGCA AAATAATAAA CAGTCTTTTG AGGTTCCTTA ACAATGAAAC GTGTTCGAGT 1140

GGCAGCAGCG GAATCCATGC YTCTTCTCCT GGAGTGTGCA AKAGTCCGTG GTCCTGAGTA 1200

TCTCACACAG ATCTGGCATT TTATGTGTGA TGCTCTAATT AAGGCCATTG GTACAGAACC 1260
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AGA.TTCAGAC GTr:CTCTCAG A-^-^TAATC-CA TTCTTTTGCA AAGGTGAATA TTXTTCTCrr 1320 

-JiAAAAT-ATG TA-TAAGCrGG TArCTTCATr TATTAGTCTT GCTAAAAAAA AAAAAAAAAA 1380 

ACm>)GA3C-C- GG3Gt:CC3GT ACCCAATT 1408 



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2031 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

AGGA.TATGCA TGATTCTTAA CCAGGCTATA TGTTAAAAAA AAATTGGAAA ATGCAATACA 60 

TrmTAITA TAJCAAACTAC AGAATGAGTA TGCAAGTTTT ATTTATCAAA ATGTAATGGA 120 

TTTITAAAGG CTGAGAAATT TTCCTTATAC CTACCTTTTC AGTTATTTTA ATTATACCAA 180

ATTATCAACT Aa-_ATAGCTT C-.TCCATATG AAATATAAAA TGAAGAGACA CCTAGGCTCT 240 

ATCAGGCITA GGATTCTTTG AACTTAITTC CACTTTAATT TCTCAGTGGA AJ3TTAAGAGG 300 

GGTGAGAAAA CAAAGAAGGG aW^J^CTGA CAACTAACAA AACCAGCACC ACATCGCTAG 360 

GTGGTCCITA CTAATOrCT TCTCAGGATT TTCCTCAGAT TGAAAAGCTT ATGAGGATTT 420 

CTiCC^STC TTAATPJ^CCT GCCTGITAGT ACAGAGCTTT CCTGATGATA TTTACTCTTG 480

AGCACATOTG GTTGTAAAAC CTTAACrTTC TTTCTCCAGG AGGGTGGTGA TAGi^AACAGA 540 

TGGTAGTATT TAIGAACTGA TCriTCTCGTG AAATGTTGAG GGTGGGGAGA AA^AGACTTTA 600 

AGGGAGGAGA GCCATCTATT TTCTTCCTAA AGCCACCTCT CAGCAGAATC GTCATGTTTT 660 

TCTCATGCAC CGCTCTGCTT CATGCCCAAG ATGACTTGCG AGGCAATCTC AGGAGCTGTG 720 

GACTTAACCR TTCCAAAGCA CACTGTCTTT CTCAGCGTTC TCTGCAAGTC AGTAGGTGTT 780

AGTATCGITG CAAAGTTCAC TGTCTCAGCA AAGOTGAACT GGGCTACCTC TCTACAGCTG 840 

TTTCCTCAGA GGGAAAAATC TTGAGACCAG ATGGTGGAGC TCTGGAGTCA GAGGAAATGG 900 

CTICTCTTCAG CACAAAGCTG CTGCTtTTAC TTCAGCCACT TCTGACATTT TTACATACCG 960 

AGCCTGAJ3AT TRTGTGATTA TCTCAAATCA AATCACTTTG ATGGAGATAA ATAATCAAAA 1020 

CTCTTTTATA GTCATTGATT TGGTGAGAAC AGTAATGGAA AATGGTGTTG AAGGACTTCT 1080

cATirriGGA GCirrccrrc CAGAGTCCTG GCTGATTGGT GTTCGCTGTT CATCTGAGCC 1140

CCCAAAACCA TTATTACTGA TACTTGCACA CAGTCAAAAG CGCAGACTGG ATGGATGGTC 1200 

TTTTATAAGG CATTTAAGGG TACACTACTG TGTTTCACTG ACCATACATT TTTCTTAGCC 1260

CCTCAAGTAA TATAGCACAG AGTTATGAAT GACAATTCCC CTAACCATTC CTCTTCATAT 1320 

CTGCCTCTTC CCCTTACCAT CGTAATTCTC CAAACTGGTC ATAAAGGCAC TCTGTGAAGA 1380 

TATTGGGGAC TGACATCTTA AGCTCTCACC TGGCTGCAGT AGGAAAGGCC AAACTGACGA 1440 
CAAAAAAAAA ATTCTTTATA AAGATGATAT GGTAACATGT ATCTTTGCCC TGGGTCTGGG 1500

TGGGTCCAGT CAGTCTCAGA TTTACAAGCA TTTAGGAGCC TAGGTAAAAG CTGCTAGTAT 1560 

TCTTTTAAAA GTrACATTTA TGACTTGCAA TGATAGAAAA CTCCTTCCAA TTAAATGGCA 1620 

TITTATAATA TTATGTGrTGT ACTTCACAGT GTTAAAAATA CCCTCATACG TTATTGCATT 1680

TGATCTTCAC AGAAAGTGCA TTTTAACCAG TACTCTGGGT GCAATAAATA ATATGTAGAA 1740 

ATTTAAGTCC TCCAATTCCA GCATATCCAG TGAGTTTTGA CAGTGTGTTT ATGTGGAATG 1800 

TTTAAGGATA TACAATTGTA CTTTATATAA ATTGGTTCTT GTTCTTCTTA AATGTGACAT 1860 

GAAATAATTG TGCTCCTACA TTATACTGGA AATTAACAGG GGAAAAGGGA AGAGCTCTTG 1920 

GCrcCCTTGA GGTTCTGCTA GTGGTGrrAG GAGTGGTTAC AACTGAGCTT TTAGTAACCA 1980

TTTAACCGTA TGTAAACTTG GTTTCTAATT AAAAAAAAAT TTCTTriTCC A 2031 






(2) INFORMATION FOR SEQ ID NO: 33: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 971 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGCGTCGGAA CTCGGCCGCG GGACATCCAC GGGGCGCGAG TGACACGCGG GAGGGAGAGC 60 

AGTGTTCTGC TGGAGCCGAT GCCAAAAACC ATGCATTTCT TATTCAGATT CATTGTTTTC 120 

TTTTATCTCT GGGGCCTTTT TACTGCTCAG AGACAAAAGA AAGAGGAGAG CACCGAAGAA 180 

GTGAAAATAG AAGTTTIGCA TCGTCCAGAA AACTGCTCTA AGACAAGCAA GAAGGGAGAC 240 

CTACTAAATG CCCATTATGA CGGCTACCTG GCTAAAGACG GCTCGAAATT CTACTGCAGC 300

CGGACACAAA ATGAAGGCCA CCCCAAATGG TTTGTTCTTG GTGTTGGGCA AGTCATAAAA 360 

GGCCTAGACA TTGCTAIGAC AGATATGTGC CCTGGAGAAA AGCGAAAAGT AGTTATACCC 420 

CCTTCATTTG CATACGGAAA GGAAGGCTAT GCAGAAGGCA AGATTCCACC GGATGCTACA 480 

TTGATTTTTG AGATTGAACT TTATGCTGTG ACCAAAGGAC CACGGAGCAT TGAGACATTT 540 

AAACAAATAG ACATGGACAA TGACAGGCAG CTCTCTAAAG CCGAGATAAA CCTCTACTTG 600

CAAAGGGAAT TTGAAAAAGA TGAGAAGCCA CGTGACAAGT CATATCAGGA TGCAGTTTTA 660

GAAGATATTT TTAAGAAGAA TGACCATGAT GGTGATGGCT TCATTTCTCC CAAGGAATAC 720 

AATCTATACC AACACGATGA ACTATAGCAT ATITGTATTT CTACTmTT TTTTTAGCTA 780 

TTTACrGTAC TTTATGTATA AAACAAAGTC ACTTTTCTCC AAGTTGTATT TGCTATTTTT 840 

CCCCTATGAG AAGATATTTT GATCTCCCCA ATACATTGAT nTGGTATAA TAAATGTGAG 900

GCTGTTTTGC AAACTTAAAA AAAAAWWAAA AAAACTSGAG GGGGGCCCGT ACCCAAOTCG 960 

CCGNATATGA T 971



(2) INFORMATION FOR SEQ ID NO: 34: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1792 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GAACCCCCTT TCTCCTGGTA AAGGGTAAGG GGGGGGATAA TGTTTACCAC AGGTACGAAA 60 

TAGTCACITT AACATTGAGA CCTCTGCCTC ATTGAATTCA GGTTTTTTAA GTACTTGAAA 120 

CTCTTCAGAT TCTCCTTATT TTAGTTTCTT TTTACATTTA TGAAGTAGAA AGCATTGTTT 180 

TGTAAACTGT TTTGAAAATA AATAGCCTAG TCTCTTATCC TCTTTAGCGT GGATTAAAGG 240

TGAAGnCTG CAAATGGGAG AGTGTTCACA GTAGATAGCT CAGATTGATT GAACACATTT 300 

GAGGAAGAGA CTCCTGCATG AGATACCAGC ATTTTTACAA ATACTTTTTA TGTACATTCT 360 

TTATTTIGTC ATTITGTCAA CCCTCTCCCC AAGCACATCT TdTTCCTTT TACTATGTCT 420 

ATCTAGGGAA AAACAAAACA AAAAATTGCA CTTACGTTAC ACTCCCAAAA TGTGGGTAAT 480 

CCGTGTCTrr CAAAAAACAT TTCTGTTTTT TGTTTTGTTT TGGTCAGTCC ATTGCATAAG 540

TGACAACJTTT GGGTGCTTGT GGCACGTATG TATGAAGCGG GAGGGGGATG ASAATTGCCT 600 

GTCCTTCAGT ARGCTGTAAA AGTAATTTAC ATGTAAGTAA AAAGGGAAAA TAGAATAGAT 660 

GCCAAAGTCA TTTArrCAGT CCTTAGTTrr CTTATGTGGC ATTACTGCAT CTGCTAGTTA 720 

GTGAGAAAGC ACCCTCAGCT TTTACTGCTC CCCTCCCTGC CTGCCAACAC ACTTGATGTG 780 

-rcCAAACAGC CCTCAAGTAT CTGTCAGATG ACCTATATAA GGTATTGAAT AAGGTATTCT 840

TOTCAGTTTA GAAATGGACT GGATAAAACT TACTTGGTTG TCATTATTTT ATCTCATTTG 900

TCCTCTTACA TGCCCTATGT TAAGATAATT ATATTGCCAC TAATAATCAA GATGCTAAAT 960



GAGTATTACA ACTCGCTAAT ATCATTTTTT ATATACAAGG GTATGTGTAT ATTTGGAATT 1020 

GPTATCAGAA ACTCATTTGT ACCCATTTGA GTGATATTGC ACAACAAACA CAGATAYCTA 1080 

CAGACTCCGT TTTCATTTTC TCGTGTTCTT TATQATAATG ATCTTTGTAG ATTGGTTATT 1140 

TCTGTACTTT ATCTCTAATA AACTTTGTAG ATCCTGTGAA CCATTACTTT GCCTAAATCA 1200 

CTTGAGACTT GAGTCTTTAA TAACAAAGCA TCAATATTCA CTAAAGTCAA TCTCTTTTGA 1260

GTTTCTGTGA CITGGCTAGA AGCTCTTGAC ACTAAGGGAT TAGTGTTAAT TTTCCCTGGG 1320 

GCyrCTTCCAC TAGGGCATTA CTGTATAATG ACTTGATGTT GCCACATAGA CTTCAAGATA 1380 

TATAATATTT TGAGGATTTT GrTGATTGGC CTATGTTTTA TTGCATAGTG TGAAACGTGT 1440

AAAGCTIGGT TAACCTGTAT ATAGATAGCT TATTGTTCAC TAGTTATAGT GTATTTAGGG 1500 

TIGCCTOTAA TATTTAAGCT TCTTTACTGA TGTGTCTGCT GGTAGGAACA TATAATTTTT 1560 

GTACATTATA TTTACTGAGA TGTTGCCTTT TTTATTTTAC AAATACTTTG GAATTCCAAT 1620 

GTGTTTTTTG CTTCCGTCAG GATTAATTTG GAAAGGTTTT TAATGACATT CCACTGATTT 1680 

CAGATTTTGC TTGAGA1TCA CTTCAATAAA TTGTCCTGTA TGTTCCAAAA AAAAATTAAA 1740

AAACTCGAGG GGGGCCCGGT ACCCAANNCG CCGGATATGA TCGTAAACAA TC 1792 






(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 896 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

AGTTGNANAC AACAGGACCT GAGTCCTTGG GCAGCACCAG TAGGTTGCCC CYTGCYTCYT 60

GCCAGCYTCA CYTGCCACYT TVTGCCCCTY TCGGGATGCC TTCGCAGACA GAGYTYTTCG 120

CTGCCTGTGG TGGCCAYTCT TTGCTTTTGG TTYTCTTGCC CCTTGGCCTC CCTTTTTGTC 180

CCCGGGCAGC CTTGTGTGAC CTGCCCITTT CCCTCCCTTC CTTTCCAGGA CAAGCACGCC 240

GAGGAGGTGC GGAAAAACAA GGAGCTGAAG GAAGAGGCCT CCAGGTAAAG CCTAGAGGCC 300

AAAGAACTTT CCAGGTCAGC CGGACAGCTC CAGCAGCTCC ACGTTCCAGG CAGCCTCGKfC 360

CGCCGGCTGC GCTCCCAGCA CTGGGGTTTG GGGGGAGGGG GGTGGCCAAG GGGCGTTTCC 420

TCTGCTTTTG GTGTTIGTAC ATGTTAAGAA TTGACCAGTG AAGCCATCCT ATTTGTTTCC 480

GGGGAACAAT GACGGGGTGG GARAGGQGAG AGGAGAGAGT TTGGGAAAGG GAGATGGAGA 540

AGAACTCAAG GACATTCCAA CCdGCCCGG CGCAGATCTG ATTTTCACAT CTCTACCTGG 600

ACATTCAGCC TCCCAGGCAC CATGTTGAGG AGAGATGAAA ACCAGGGCGG TAGAACTTCA 660

GGGTGAAGGA CAGAGTCCTG GGTGGGGCAG CGGCTGCAGG GCGCACCAGA GAACCCAGCC 720

AGAGGGGGTG TGACTACCAG TGGTGTTGCT TCCACCCTGC AGCAGGTGGG ATGAGGTCTG 780

TCTGTGTCTG TGAACCATCA TTTTTTGATC ATCATGACCA ATGAAACATT GAAAAAAAAA 840

AAAAAAACTG GAGGGC3GGCC CGTACCCAAN TCGCCGNATA GTGATCGTAA ACAATC 896



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 912 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:



TCGACCCACG CGTCCGGTCA GCCAGTCGCA TCCAGCCATG ACAGCCTTCT GCTCCCTGCT 60

CCTGCAAGCG CAGAGCCTCC TACCCAGGAC CATGGCAGCC CCCCAGGACA GCCTCAGACC 120

AGGGGAGGAA GACGAAGGGA TGCAGCTGCT ACAGACAAAG GACTCCATGG CCAAGGGAGC 180

TAGGCCCGGG GCCAKCCGCG GCAGGGCTCG CTGGGGTCTG GCCTACACGC TGCTGCACAA 240

CCCAACCCTG CAGGTCTTCC GCAAGACGGC CCTGTTGGGT GCCAATGGTG CCCAGCCCTG 300

ARGGCAGGGA AKGTCAACCC ACCTGCCCAT CTGTGCTGAG GCATGTTCCT GCCTACCATC 360

CTCCTCCCTC CCCGGCTCTC CTCCCAGCAT CACACCAGCC ATGCAGCCAG CAGGTCCTCC 420

GGATCACYGT GGTTKGGTGG AGGTCTGTCT GCACTGGGAG CCTCARGARG GCTCTGCTCC 480

ACCCACTTGG CTATGGGAGA GCCAGCAGGG GTTCTGGAGA AAAAAACTGG TGGGTTAGGG 540

CCTTGGTCCA GGAGCCAGTT GAGCCAGGGC AGCCACATCC AGGCGTCTCC CTACCCTGGC 600

TCTGCCATCA GCCTTGAAGG GCCTCGATGA AGCCTTCTCT GGAACCACTC CAGCCCAGCT 660

CCACCTCAGC CTTGGCCTTC ACGCTGTGGA AGCAGCCAAG GCACTTCCTC ACCCCYTCAG 720

CGCCACGGAC CTYTYTGGGG AGTGGCCGGA AAGCTCCCSG GCCTYTGGCC TGCAGGGCAG 780

CCCAAGTCAT GACTCAGACC AGGTCCCACA CTGAGCTGCC CACACTCGAG AGCCAGATAT 840

TTTTGTAGrr TTTATKCCTT TGGCTATTAT GAAAGAGGTT AGTGTGTTCC CTGCAATAAA 900

CTTGTTCCTG AG 912



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1382 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

AATTCGGCAC GAGCGGAGGC GAGGGAAACT RAGGGCGAAA GTTGTGTGTC GTGTTGGCAG 60

GAGGGCCTAG AAGGGAAAGA CTGTCTAGTG GGACAATGTC ATATTATAAA TITGGAATGC 120

TGAATAGAAA ATTATAGATT TTGATATTGA AGGAAATGAA GCGAAGCYTA AATGAAAATT 180

CAGCTCGAAG TACAGCAGGC TGTTTGCCTG TTCCGTTGTT CAATCAGAAA AAGAGGAACA 240

GACAGCCATT AACTTCTAAT CCACTTAAAG ATGATTCAGG TATCAGTACC CCTTCTGACA 300

ATTATCAITT TCCTCCTCTA CCTACAGATT GGGCCTGGGA AGCTGTGAAT CCAGAGTTKG 360

CTCCTGTAAT GAAAACAGTG GACACCGGGC AAATACCACA TTCAGTTrCT CGTCCTCTGA 420

GAAGTCAAGA TTCTGTCTTT AACTCTATTC AATCAAATAC TGGAAGAAGC CAGGGTGGTT 480

GGAGCTACAG AGATGGTAAC AAAAATACCA GCTTGAAAAC TTGGRATAAA AATGATTTTA 540

AGCCTCAATG TAAACGAACA AACTTAGTGG CAAATGATGG AAAAAATTCT TGTCCAATGA 600

GTTCGGGAGC TCAACAACAA AAACAATTAA GAACACCTGA ACCTCCTAAC TTATCTCGCA 660

ACAAAGAAAC CGAGCTACTC AGACAAACAC ATTCATCAAA AATATCTGGC TGCACAATGA 720

GAGGGCTAGA CAAAAACAGT GCACTACAGA CACTTAAGCC CAATTTTCAA CAAAATCAAT 780

ATAAGANACA AATGTTGGAT GATATTCCAG AAGACAACAC CCTGAAGGAA ACCTCATTGT 840

ATCAGTTACA GTTTAAGGAA AAAGCTAGTT CTTTAAGAAT TATTTCTGCA GTTATTGAAA 900

GCATGAAGTA TTGGCGTGAA CATGCACAGA AAACTGTACT TCTTTTTGAA GTATTAGCTG 960

TTCTTGATTC AGCTCTTACA CCTGGCCCAT ATTATTCGAA GACTTTTCTT ATGAGGGATG 1020

GGAAAAATAC TCTGCCTTCT GTCTTTTATG AAATCGATCG TGAACTTCCG AGACTGATTA 1080

GAGGCCGAGT TCATAGATGT GTTGGCAACT ATGACCAGAA AAAGAACATT TTCCAATGTG 1140

TITCTGTCAG ACCGGCGTCT GTTTCTGAGC AAAAAACTTT CCAGGCATTT GTCAAAATTG 1200

CAGATOTTGA GATGCAGTAT TATATTAATG TGATGAATGA AACTTAAGTA GTGATAAAAG 1260

GAAGTTTAGC ATAAATTATA GCAGTTTTCT GTTATTGCTT AATTTACCAT CrCCATAGTT 1320

TTATAGCTAC TATTOTATTT CACTTGTTGA ATTAAAGTAT TTGAATTCTT TTAAAAAAAA 1380

AA 1382



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 872 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
GGGCTACTTC AAAGCCCTGG GCCTTATTTC TTCAGGTAAA AAAATATAAA GTCAGATCTC 
ATCCCGGCTG GCCATGCTGT TAGACCCTTT CATCCTTCTC TTCTGCCTCT TCTCAACAGC 
TGCCCAGTCC TGTTTGGAAT TCATATACAT ACAGTTCTAA TACTGATGTA TTTACCCTCA 
TAAGCCACTC AACCCAGAAT CTTATTTGAA TTATAATCCA GAAACATCAG GTGACGTGTG 
AGACTACTGT ATGAGAAAGA GACAGTTTAA GGGTCAGTCC AATGGAAAAA AGAGTTCTCA 
GAGCnTCTT TAGCTTATTC TCATCAAAGA GCTTTCTCTG CAGAAGGAAC CTACTGGTTC 
CTCCTTTCCA GTCCTAGAAA TCCTGACCTA GAGTGGCTTA ATCCTGCTAG CACCTCTCTC 
TCGCACTCTG GTGCCAAATG ACTCCAGGAA CTGGGCCATG ATGTGGTGGG AATGACCTTA 
CCCTGAGCAT GTCACTCATG CATTGAACAA CAGCTAAGAG CAGAGCTTAG AGCTTAGAGC 
TGGGCCCTGT AAGGTGAGAG GAATCACATC CTGCAGAAGT CTGTCCTGAG AAGCAGGTAC 
TCCTCrrCACA GCAGAGACAC AGTGGATACC TGAGTAACAA TAATACAAGA CAGGACGTGG 
GMACAGCAAA AGATTTGGGT GTCAGAAGAR GCCGAGAACA CTTYCAGGCA GGAACATTCA 
RARTTGTTCT TGGAGGAART AGGCMCSAAG GCTGGGCAGG ATTTCMCGGG GCAGAGATGG 
AGCAAGCAAT TGAAATGAAA GCCATGGCAT GGGAAAAGGA GCACTGGCCA CAGGGAGTGC 
AACGTTGTGA TGCAAGGCCA CTGTGGAGCC AT 

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 812 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

GGCAGAGGCT CACCCCAGCA GAGATTGAGG GGGAACCGTG ATGAAATTTT TAAGTATTCT 60

GCTTGATCAT AATAATTTTY CTCTTATGTT AATGTTGGCT CCGTTTGGGT GTTTAGCTTT 120

TGAAAGGAGT A1GAAAATGC GGAATGGGGC TTTGGGGCTT GAGGAGGTGT GATCTCTAC3T 180

GTTTAAAAAA TTTAATTGCA CAAATAGAAA TAATTCACCC ACATTATTGA ACCCCACTAA 240

AGCATATCCT TnTCTCCAT ATTCCTTTCC TGCTGCCCTC GTGTGTACCA TTATTACTCA 300

GITCTGATTT GAGCTCGTTC CACTTAAAGT CATTCATAGA TACTTTTGCG TCGTGTTKGA 360

ATATTTATTG AATTTCTATT CTGTGTTTTA CTTAATTACT TTATTATGGA ACCTTTACAC 420

AGGTCTGGTG TAdTGTTCT TTGAAAAGTC TTATGTTGAC CACCATCACT GAGCATATAG 480

CTTTTTCCTT ATTTCCTTGG GATAATTACC CGAAGTGGAA ATACCGAATC AAACTTCTGT 540

'rricr r rcTT TGGCACTATT ATATAAATTG TTTTCCAAAC AAGGCATGTT TACAATAGAC 600

ATTTTTCAAA ATCTCGGTAT TTGTCCTATT TTGCTCTCTG TATGCAGAAT TCAGCGGGGT 660

GCCAAGTCGT TTTCTGTGTG GGTTGAGAGA CAGGCTGTGC AGCCCACTGT TGCATAGGAC 720

TAACTACTAC AAATCATGCT GAGACCGAGC TATrTTTGCT GCTTAGARGC TTTGCAGCCT 780

TGAGTAAGTT TCGNCATCTG GAAACNTTGN AA 812



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1515 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

AATTCGC-CAC aAGC^XAAATT CAAGCACTTT TCCTAAAAGA AGGGGGAATG GATGCTGAAA 60

CAACACGTNT CCCACAAAGG GAGCAGACAC TGGGCTTGTG AAGCTGCCCC ATACCTTCCC 120 

CACAGAACTO GGGTCCGGCC TCCCTGACAT GCAGATTTCC ACCCAGAAGA CAGAGAAGGA 180 

GCCAGTGGTC AIXSGAATGGG CTCGGGTCAA AGACTGGGTG CCTGGGAGCT GAGGCAGCCA 240 

CCGTITCAGC CTCGCCAGCC CTCTGGACCC CGAGGTTGGA CCCTACTGTG ACACACCTAC 300 

CATGCGGACA CTCTTCAACC TCCTCTGGCT TGCCCTGGCC TGCAGCCCTG TTCACACTAC 360

CCTGTCAAAG TCAGATGCCA AAAAAGCCGC CTCAAAGACG CTGCTGGAGA AGAGTCAGTT 420 

TTCAGATAAG CCGGTGCAAG ACCGGGGTTT GGTGGTGACG GACCTCAAAG CTGAGAGTGT 480 

GCTTCTTCAG CATCGCAGCT ACTGCTCGGC AAAGGCCCGG GACAGACACT TTGCTGGGGA 540 

TC?rACTCGGC TATOTCACTC CATGGAACAG CCATGGCTAC GATCTCACCA AGGTCTTTGG 600 

GAGCAAC?rrC ACACAGATCT CACCCGTCTG GCTGCAGCTG AAGAGACGTG GCCGTGAGAT 660

GTTTGAGGTC ACGGGCCTCC ACGACGTGGA CCAAGGGTGG ATGCGAGCTG TCAGGAAGCA 720 

TX^CCAAGGGC CIGCACATAG TGCCTCGGCT CCTGrTTTGAG GACTGGACTT ACGATGATTT 780 






CCGGAACGTC TTAGACAGTG AGGATGAGAT AGAGGAGCTG AGCAAGACCG TGGTCCAGGT 840

GGCAAAGAAC CAGCATTTCG ATGGCTTCGT GGTGGAGGTC TGGAACCAGC TGCTAAGCCA 900

GAAGCGCGTG ACCGACCAGC TGGGCATGTT CACGCACAAG GAGTTTGAGC AGCTGGCCCC 960

CGTGCTGGAT GGTTTCAGCC TCATGACCTA CGACTACTCT ACAGCGCATC AGCCTGGCCC 1020

TAATGCACCC CTGTCCTGGG TTCGAGCCTG CGTCCAGGTC CTGGACCCGA AGTCCAAGTG 1080

GCGAAGCAAA ATCCTCCTGG GGCTCAACTT CTATGGTATG GACTACGCGA CCTCCAAGGA 1140

TCCCCGTCAG CCTGTTGTCG GGGCCAGGTA CATCCAGACA CTGAAGGACC ACAGGCCCCG 1200

GATGGTGTGG GACAGCCAGG YCTCAGAGCA CTTCTTCGAG TACAAGAAGA GCCGCAGTGG 1260

GAGGCACGTC GTCTTCTACC CAACCCTGAA GTCCCTGCAG GTGCGGCTGG AGCTGGCCCG 1320

GGAGCTGGGC GTTGGGGTCT CTATCTGGGA GCTGGGCCAG GGCCTGGACT ACTTCTACGA 1380

CCTGCTCTAG GTGGGCATTG CGGCCTCCGC GGTGGACGTG TTCTTTTCTA AGCCATGGAG 1440

TGAGTCAGCA GGTGTGAAAT ACAGGCCTTC ACTCCGTTAA AAAAAAAAAA AAAAAAAAAA 1500

AAAAAAAAAA AAAAA 1515



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 704 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

AAGA-TCGTGG CGCCCAGAGC TTCGCTCTAT GCTGCTCCCC TGAGAGAGGC GTTTCCATCA 60

ACCAGTTTTG CAAGGAGTTC AATGAGAGGA CAAAGGACAT CAAGGAAGGC ATTCCTCTGC 120

CTACCAAGAT TTTAGTGAAG CCTGACAGGA CATTTGAAAT TAAGATTGGA CAGCCCACTG 180

TTTCCTACTT CCTGAAGGCA GCAGCTGGGA TTGAAAAGGG GGCCCGGCAA ACAGGGAAAG 240

AGGTGGCAGG CCTGGTX^CC TTGAAGCATG TGTATGAGAT TGCCCGCATC AAAGCTCAGG 300

ATGAGGCATT TGCCCTGCAG GATGTACCCC TGTCGTCTGT TGTCCGCTCC ATCATCGGGT 360

CTCCCCGTTC TCTGGGCATT CGCGTGGTGA AGGACCTCAG TTCAGAAGAG CTTGCAGCTT 420

TCCAGAAGGA ACGAGCCATC TTCCTGGCTG CTCAGAAGGA GGCAGATTTG GCTGCCCAAG 480

AAGAAGCTGC CAAGAAGTGA CCCTTGCCCC ACCAACTCCC AGATTTCAAA GGAGGTAGTT 540

GCAAAAGOrc TGCCCAAGGG GAGGAAGGAG GTCACACCAA TATGATGATG GTTTTCATGA 600

CTTTGAATGA TATATTTTTG TACATCTAGC TGTATCGAGG CATCAGGCCT GAATAAACAT 660

CXTTTCTTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 704



(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1094 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GGCAGCTTTC TTACAAACCC ATCCTTCTGA AATGTTGCTT CAAATTCATC CTCTGCTCCC 60 

CAGTCCCACT ATTCCACACA TACTGTTACT GTTTCTTTAT CCTACTTTCT CAATTTTGGA 120 

ACATAGTTGC AGTTACTGCA TTGAATACCT GTGGGTTTGC CTGTTGTTCT GTCTGTCTCT 180 

GTCGTTCTTG TAATANTGGA TCCCAGAGAT AAAATGGACA GTTGTNATGC ACAGTTAATT 240 

CAGAAACTAG ACCTTACTTG CTGTGTGAAA TACCAACTAA ATTCTCAGTG AACTCAGCTG 300 

ANCTTTATCT CCTTTTGTTT CCCCAATTTA TAATTTCAGT TCAGGCCCAG AAAGATGGAA 360 

TCCCAGCTAA GAAATACAAG TTACACCCTG TACTAGCAGC CCATGTGTGC ATGTTCTTTA 420 

AGTGCTCTTG CAGCTATGTC ATTTATATTG ATTTCCCTGT ATTATTATAA GCAAAGCAAA 480 

TTTGAGGAAA AAAACCCATA ATACCACACC TCATTTTTTT CAAGTAATAG GGTCATAAGT 540 

CTCATYCTYC ATATAATATG TTGAGTATGC AGTATATTAT GTGTTAGGCT CTGGANAGGC 600 

AGAGGTTAGA TCATGTVIACA GATQATATCK GATTAGGCAG ATAAACAGTA TTTTAACCTT 660 

TTCCTTATTA TATGTAACTT GCTTTCAGGT TTTTTAATGT TACTATTATG TCTTTAATAT 720 

ATTATCTTTA TTTGTACTTT TGTATACAGA GTGATTTTCC TTTTTTAAAA AAAATTGTGT 780 

CTTTAGGATG GATTCCAAAG ATGTGGAATC AGTAGGTTTA AGGAATATGG ATATTTTGGC 840 

TGGCAAGGTG GCTCACACCT GTAATCCCAG CACTTTGGGA GGCTGAGGTG GGTGGATCAC 900 

CTGAAGTCAG GAGITCGAGA CCAGCCTGAC CAACATGGCG AAACCCTGTT TNTACTAAAG 960 

ACACACWWAA AATTRGCCAG TGGTGGTGGC ATGTGCTTGT AGTCCCACTT AGCTACTCGA 1020 

GAGGCTGAGG CAGGAGAATC GCTTGAACCC QGGAGGCAGA GGTTGCAGTG AGGCAAGATG 1080 

GCACCTCTAC ACTC 1094 



(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1821 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:



TGGCTTPiGGC CXIC^JZZCTZ CCCITGC-CTG GAACTACTGG ACAGACCCTT TTGAGATGTG 
CCTGTGGTGC TGTSGAGATG IGTGTAGTGG TCTTAGCTCT TTGTTGAGCT TGTGTGTGTG 
TTKrrGTACrrC rTA^CTGTA? 3CTGAAATTG GGCGTGTGTT GGAGGGCTTC TTAGCTCTTT 
GGTa^ATTG TATTTCTATG rGnTGTATC ASCTGAATGT TGCTGGAAAT AAAACCTTGG 
TTTGIWJ^r-G CTCiTTTTTC- TGGGAAGTAA GTAGGGGAAA AGGTCTTTGA GGGTTCCTAG 

GCTCCTTrrcrr acaacaggaa aatgcctcaa agccttgctt cccagcaacc tggggctggt 

TCCCAGTGCC TGG7CCTGCC CCTTCCTGGT TCTTATCTCA AGGCAGAGCT TCTGAATTTC 
AGGCCTTCAT TCC-jSAGCCC TdTGTGGCC AGGCCTTCCT TTGCTGGAGG AAGGTACACA 
C-GGTGA.-j:-CT GATGCTCrrAC TTGGGGGATC TCCTTGGCCT GTTCCACCAA GTGAGAGAAG 
GTACTTACTC ZTCrACCTCC TGTTC^CA GGTGCATTAA CAGACCTCCC TACAGCTGTA 
GGAACTACTG TCCCAGAGCT GAGGCAAGGG GATTTCTCAG GTCATTTGGA GAACAAGTGC 
TTT---2TAG-'A GTTTAAAGTA GTAACTGCTA CTGTATTTAG TGGGGTGGAA TTCAGAAGAA 

ATTTGAAGAC c>j:^-::x:?':rQG ■gtggtctgca tgtgaatgaa caggaatgag ccggacagcc 

TGGCrGTCi.T TGCTTrTCTTC CTCCCCATTT GGACCCTTCT CTGCCCTTAC ATTTTTGTTT 
CTCC^TCTAC CACCATCCAC CAGTCTATTT ATTAACTTAG CAAGAGGACA AGTAAAGGGC 
CCTCTTC-GCT TGATTTTGCT TCTTTCTTTC TGTGGAGGAT . ATACTAAGTG CGACTTTGCC 
CTATCCTATT TGGAAATCCC TAACAGAATT GAGTTTTCTA TTAAGGATCC AAAAAGAAAA 
ACAAAATGCT AATGAAGCCA TCAGTCAAGG GTCACATGCC AATAAACAAT AAATTTTCCA 
GAAGAAA7GA AATCCAACTA GACAAATAAA GTAGAGCTTA TGAAATGGTT CAGTAAGGAT 



CACTCAGGCT GGAC^TGCAGT GGTATGATCT TGGCTCACTG TAACCTCCGC CTCCCGGGTT 
CAAGCCA'rrC TCCTGCCTCA GTCTCCTGAG TAGCTGGGAT TACAGGTGCG TGCCACCATG 
CCTGGCTAAT TTITGTCTTT TTAGTAGAGA CAGGGTTTCA CCATGTTGGT CGGGCTGGTC 
TCAAACTCCT GACCTCTTGA TCCGCCTGCC TTGGCCTCCC AAAGTGATGG GATTACAGAT 
GTGAGCCACC CGrTGCCCTAG CCAAGGATGA GATTTTTAAA GTATGTTTCA GTTCTGTGTC 
ATGGTTGGAA GACAGAGTAG GAAGGATATG GAAAAGGTCA TGGGGAAGCA GAGGTGATTC 
ATGGCTCrGT GA^.TTTGAGG TGAATGGTTC CTTATTGTCT AGGCCACTTG TGAAGAATAT 




I ' l ' i ' - ' lTO ri T TGTTTTGTTT TGKTTTTTTA AAGACGGAGT CTCGCTCTGT 
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GAGTCACrrA TTGCCAGCCT TGGAATTTAC TTCTCTAGCT TACAATC5GAC CTTTTGAACT 1680 

GGAAAACACC TTGTCTCCAT TCACTTTAAA ATGTCAAAAC TAATTTTTAT AATAAATGTT 1740 

TATTTTCACA TTCAAAAAAA AAAAAAATTT AAAAACYCGG GGGGGGCCCS CTACCCCATT 1800 

NGCCCCTAAG GGGGGGGGTT T 1821

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1024 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GGGGCACAGT TGAAGAAGCG ACCGAGGGAC TGGGAGTCGT TAGTGAGGAT GACGCGGCAT 60

GGCAAGAACT GCACCGCAGG GCCGTCTACA CCTACCACGA GAAGAAGAAG GACACAGCGG 120

CCTCGGGCTA TGGGACCCAG AACATTCGAC TGAGCCGGGA TGCCGTGAAG a^^CTTCGACT 180

GCTGTTGTCT CTCCCTCCAG CCTTGCCACG ATCCTGTTGT CACCCCAGAT GGCTACCTGT 240

ATGAGCGTGA GGCCATCCTG GAGTACATTC TGCACCAGAA GAAGGAGATT GCCCGGCAGA 300

TGAAGGCCTA CGAGAAGCAG CGGGGCACCC GGCGCGAGGA GCAGAAGGAG CTTCAGCGGG 360

CGGCCTCGCA GGACCATCTG CGGGGCTTCC TGGAGAAGGA GTCGGCTATC GTGAGCCGGC 420

CCCTCAACCC rrrCACAGCC AAGGCCCTCT CGGGCACCAG CCCAGATGAT GTCCAACCTG 480

GGCCCAGTGT GGGTCCTCCA AGTAAGGACA AGGACAAAGT GCTGCCCAGC TTCTGGATCC 540

CGTCGCTGAC GCCCGAAGCC AAGGCCACCA AGCTGGAGAA GCCGTCCCGC ACGGTGACCT 600

GCCCCATCTC AGGGAAGCCC CTGCGCATGT CGGACCTGAC GCCCGTGCAC rrCACACCGC 660

TAGACAGCTC CGTGGACCGC GTCGGGCTCA TCACCCGCAG CGAGCGCTAC GTGTGTGCCG 720

TGACCCGCGA CAGCCTGAGC AACGCCACCC CCTGCGCTGT GCTGCGGCCC TCTGGGGCTG 780

TGGTCACCCT CGAATGCGTG GAGAAGCTGA TTCGGAAGGA CATGGTGGAC CCTGTGACTG 840

GAGACAAACT CACAGACCGC GACATCATCG TGCTGCAGCG GGGCGGTACC GSTTCGCGGG 900

CTCCGGAGTG AAGCTGCAAG CGGAGAAATC ACGGCCGGTG ATGCAGGCCT GAGTGTGTGC 960

GGGAGACCAA ATAAACCGGC TTCGGTGCGC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1020

AAAA 1024

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 983 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

CGACACGGCT GCGAGAAGAC GACAGAAGGG CCCGACCGCG AGCCGTCCAG GTCTCAGTGC 60

TGTGCCCCCC CCAGAGCCTA GAGGATGTTT CATGGGATCC CAGCCACGCC GGGCATAGGA 120

GCCCCTGGGA ACAAGCCGGA GCTGTATGAG GAAGTGAAGT TGTACAAGAA CGCCCGGGAG 180

AGGGAGAAGT ACGACAACAT GGCAGAGCTG 1TTGCGGTGG TGAAGACAAT GCAAGCCCTG 240

GAGAAGGCCT ACATCAAGGA CTGTGTCTCC CCCAGCGAGT ACACTGCAGC CTGCTCCCGG 300

CTCCTGGTCC AATACAAAGC TGCCTTCAGG CAGGTCCAGG GCTCAGAAAT CAGCTCTATT 360

GACGAATTCT GCCGCAAGTT CCGCCTGGAC TGCCCGCTGG CCATGGAGCG GATCAAGGAG 420

GACCGGCCCA TCACCATCAA GGACGACAAG GGCAACCTCA ACCGCTGCAT CGCAGACGTG 480

GTCTCGCTCT TCATCACGGT CATGGACAAG CTGCGCCTGG AGATCCGCGC CATGGATGAG 540

ATCCAGCCCG ACCTGCGAGA GCTGATGGAG ACCATGCACC GCATGAGCCA CCTCCCACCC 600

GACTTTGAGG GCCGCCAGAC GGTCAGCCAG TGGCTGCAGA CCCTGAGCGG CATGTCGGCG 660

TCAGATGAGC TGGACGACTC ACAGGTGCGT CAGATGCTGT TCGACCTGGA GTCAGCCTAC 720

AACGCCTTCA ACCGCTTCCT GCATGCCTGA GCCCGGGGCA CTAGCCCTTG GACAGAAGGG 780

CAGAGTCTGA GGCGATGGCT CCTGGTCCCC TGTCCGCCAC ACAGGCCGTG GTCATCCACA 840

CAACTCACTG TCTGCAGCTG CCTGTCTGGT GTCTGTCTTT GGTGTCAGAA CTTTIGGGCC 900

GGGCCCCTCC CCACAATAAA GATGCTCTCC GACCTTCAAA AAAAAAAAAA AAAAAAAAGR 960

KGSGGCCGGT CCCCANTCCC CCC 983



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2421 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CCGGCTGATC GCTGCCGCTC CQCCAATACA ATAGAGCCAK CCACTACCAG CAGCCTGGCC 60 

CTCTTCCTCC TTCTCCAGAG AGACCAATCC AGCCGAACTC GGGGTTTGCC TGAGGAGAAG 120

GAGGAAGTGA CCATGGACAC AAGTGAAAAC AGACCTGAAA ATGATGTTCC AGAACCTCCC 180

ATCCCTATTG CAGACCAAGT CAGCAATGAT GACCGCCCGG AGGGCAGTGT TGAAGATGAG 240

GAGAAGAAAG AGAGCTCGCT GCCCAAATCA TTCAAGAGGA AGATCTCCGT TGTCTCAGCT 300

ACCAAGGGGG TGCCAGCTGG AAACAGTGAC ACAGAGGGGG GCCAGCCTGG TCGGAAACGA 360

CGCTGGGGAG CCAGCACAGC CACCACACAG AAGAAACCTT CCATCAGTAT CACCACTGAA 420

TCACTAAAGA GCCTCATCCC CGACATCAAA CCCCTGGCGG GGCAGGAGGC TGTTGTGGAT 480

CTTCATGCTG ATGACTCTCG CATCTCTGAG GATGAGACAG AGCGTAATGG CGATGATGGG 540

ACCCATGACA AGGGGCTGAA AATATGCCGG ACAGTCACTC AGGTAGTACC TGCAGAGGGC 600

CAGGAGAATG GGCAGAGGGA AGAAGAGGAA GAAGAGAAGG AACCTGAAGC AGAACCTCCT 660

GTACCTCCCC AGGTGTCAGT AGAGGTGGCC TTGCCCCCAC CTGCAGAGCA TGAAGTAT^ 720

AAAGTGACrr TAGGAGATAC CTTAACTCGA CGTTCCATTA GCCAGCAGAA GTCCGGAGTT 780

TCCATTACCA TTGATGACCC AGTCCGAACT GCCCAGGTGC CCTCCCCACC CCGGGGCAAG 840

ATTAGCAACA TTGTCCATAT CTCCAATTTG GTCCGTCCTT TCACTTTAGG CCAGCTAAAG 900

GAGTTGTTGG GGCGCACAGG AACCTTGGTG GAAGAGGCCT TCTGGATTGA CAAGATCAAA 960

TCTCATTGCT TTGTAACGTA CTCAACAGTA GAGGAAGCTG TTGCCACCCG CACAGCTCTG 1020

CACX3GGGTCA AATGGCCCCA GTCCAATCCC AAATTCCTTT GTGCTGACTA TGCCGAGCAA 1080

GATGAGCTGG ATTATCACCG AGGCCTCTTG GTGGACCX3TC CCTCTGAAAC TAAGACAGAG 1140

GAGCAGGGAA TACCACGGCC CCTGCACCCC CCACCCCCAC CCCCGGTCCA GCCACCACAG 1200

CACCCCCGGG CAGAGCAGCG GGAGCAGGAA CGGGCAGTGC GGGAACAGTG GGCAGAACGG 1260

GAACGGGAAA TGGAGCGGCG GGAGCGGACT CGATCAGAGC GTGAATGGGA TCGGGACAAA 1320

GTTCGAGAAG GGCCCCGTTC CCGATCAAGG TCCCGTRACC GCCGCCGCAA GGAACGTGCG 1380

AAGTCTAAAG AAAAGAAGAG TGAGAAGAAA GAGAAAGCCC AGGAGGAACC ACCTGCCAAG 1440

CTGCTGGATG ACCTTTTCCG AAAGACCAAG GCAGCTCCCT GCATCTATTG GCTCCCACTG 1500

ACTGACAGCC AGATCGTTCA GAAAGAGGCA GAGCGGGCCG AACGGGCCAA GGAGCGGGAG 1560

AAGCGGCGAA AGGAGCAAGA AGAAGAAGAG CAAAAGGAGC GGGAGAAGGA AGCCGAGCGG 1620

GAACGGAACC GACAGOTGGA GCGAGAGAAA CGTCGGGAGC ACAGTCGGGA GAGGGACAGG 1680

GAGAGAGAGA GAGAAAGGGA GCGGGACAGG GGGGACCGAG ATCGGGATAG GGAAAGGGAC 1740

CGAGAACGAG GCAGGGAAAG GGATCGCAGG GACACCAAGC GCCACAGCAG AAGCCGGAGT 1800

CGGAGCACAC CTGTGCGGGA CCGGGGTGGG CGCCGCTAGC TGGGAAAACA CTAGAGCTGC 1860
CGCCGCTAGC TGGGAAAACA CTAGAGCTGC 1860 
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AGGTACCAGC CACTCGGCCC CAGGGGGTTA TGGCCACAGA GGGATAGGO^ CAGTCTCC-.C 1920 

CACCCTGGAG CCAAGGGTCT TTCACATCAC CTATCCCTAC ATACATACCA AATGGAAAAG 1930 
5 TCGCCATCCT TTTCCCCCCA AACACACCCC CTTAACCTAT CTCTTGGGi^r TTAGCCCGAC 

CCTCCCTCTC ATTTCCCATT AAGTCTGAGA GGCAAGAGCT AGGTTAGGC^ AGGAGGTGGT 2100 

TGGCCAGAGA TGGGGAACAG CCAGGTGCCC CAGTCCTCTG ATTTTTCCTC CATCCTGCTT 2150 

10 

ACCACCTCCC TGGGTACTTA CAGCCTTCTC TTGGGAACAG CCGGGGCCAG GACTGGGTCA 2220 

CCTATCAGCT GAATCAGCAT CTCCTCCTGA GTCCCAGGGC CCCTGCAGTT CCCAGTCTCT 2230 

15 TCTGTCCTGC AGCCCTTGCC TCTTTCCCAC AGGTTCCACT 1TATATCCAC CTTTTCCTTT 2340 

TGTTCAATTT TTATTTTTAT TTTTTTTATT ATTAAATGAT GTGGTCTATG GAAAAAAAAA 2400 

TAAAAATCTG ACTrAGTTTT A 2421 

20 



(2) INFORMATION FOR SEQ ID NO: 47: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 840 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

CTCAAACTCC TGAGCTGAAG CGATCTACCT GCCTCAGCTA GGATTACAGG TGTGAGCC-.C 60
CGCACCCAAC CTCAATAAGC KTATTTGATA AAAKATATGC AAGCTCCCTT TATKCACTTT 120
TCATTCAGAA TGTTTAGTAA TTTGTATTGT TTTTCAGATT TTCAGCCCAA TATATCTCC/ 180
TGCCCACTGT GTCACTGTAT TCTACCTAWA CATCATCACG TGTTTCTGCT ATTGGCTGTA 240
TGATGGAACA CTGCGGCTCA TTTTCCTGAA AACTGCCGAT AGTGCATAGA RTGCTGGGAT 300
GGAAACCAGA ARCTTTGAAT TCAAGCCTTG GTTCTGCCTT GTTnTGCTT GGGTGGCCiT 360
GAGTCAGCCA CATACCTTTT AAAATCTCAA TTTATTAGAA ATTATTCCAA ATCAAAATCA 420
AATGAGAAGG TATATACAAA AGTGCTITAT CCCACAATAA ACTATTCAAG AGAGAGCAAA 480
GGAGAGGACA TTTACTCAAC ACCTCCTAAA AGGCAGCCAG TGAAATTAC<; CATTTTAnT 540
AATCCTCCTG GCAACTCTGA GAGTAAAGCA TTATTAATCC CATITTGGCT GTTTAAAGAA 600
ATTATTTCCA CTAGATTCCA GCTGTAGTTT AGYTTCAGAA AAAAAAATCC TGAGATGTGA 660
ATTCACAGCT TTCTCGGTTT AAAGCCCAAG CTCTATCACA TCATGCTATT ATTGTTACA? 720
TACTGCTAGT TCTATGAAAA GAAATACTAA TTTATGAAAT ACATCTTATC GAAAAAAAAA 780
AAAAAAAAAC TGGGAGGGGG GGCCCGTACC CAAATCGCCG GATAGTGATC GTAAACAATC 840

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2432 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GGCACGAGGC CCGGAACGCT GAGGAAGGGC CCGTCCCGCC TTCCCCGGCG CGCCATGGAG 60
CCCCGGGCGG TTGCAGAAGC CGTGGAGACG GGTGAGGAGG ATGTGATTAT GGAAGCTCTG 120
CGGTCATACA ACCAGGAGCA CTCCCAGAGC TTCACGTTTG ATGATGCCCA ACAGGAGGAC 180
CGGAAGAGAC TCGCGGASTG CTGGTCTCCG TCCTGGAACA QGGCTTGCCA CCCTCCCACC 240
GTGTCATCTG GCTGCAGAGT GTCCGAATCC TGTCCCGGGA CCGCAACTGC CTGGACCCGT 300
TCACCAGCCG CCAGAGCCTC CAGGCAYTAG CCTGYTATGY TGACATCTCT GTCTCTGAGG 360
GGTCCGTCCC AGACTCCGCA GACATGGATG TTGTACTGGA GTCCCTCAAG TGCCTGTGCA 420
ACCTCGTGCT CAGCAGCCCT GTGGCACAGA TGCTGGCAGC AGAGGCCCGC CTAGTGGTGA 480
AGCTCACAGA GCGTGTGGGG CTGTACCGTG AGAGGAGCTT CCCCCACGAT GTCCAGTTCT 540
TTGACTTGCG GCTCCTCTTC CTGCTAACGG CACTCCGCAC CGATGTGCGC CANAGCTGTr 600
TCAGGAGCTC AAAGGAGTGC GCCTGCTAAC TGACACACTG GAGCTGACGC TGGGGGTGAC 660
TCCTGAAGGG AACCCCCCAC CCACGCTCCT TCCTTCCCAA GAGACTGAGC GGGCCATGGA 720
GATCCTCAAA GTCCTCTTCA ACATCACCCT GGACTCCATC AAGGGGGAGG TGGACGAGGA 780
AGACGCTCCC CTTTACCGAC ACCTGGGGAC CCTTCTCCGG CACTGTGTGA TGATCGCTAC 840
TGCTGGAGAC CGCACAGAGG AGTTCCACGG CCACGCAGTA ASCCTCCTGG GGAACTTGCC 900
CCTCAAGTGT CTCGATOTTC TCCTCACCCT QGAGCCACAT GGAGACTCCA CGGAGTTCAT 960
GGGAGTGAAT ATGGATGTGA TTCG?rGCCCT CCTCATCTTC CTAGAGAAGC GTTTGCACAA 1020
GACACACAGG CTCAAGGAGA GTCTAGCTCC CGTGCTGAGC GTGCTGACTG AATGTGCCCG 1080
GATGCACCGC CCAGCCAGGA AGTTCCTGAA GGCCCAGGTG CTGCCCCCTC TGCGGGATGT 1140
GAGGACACGG CCTGAGCTTTG GGGAGATGCT GCGGAACAAG CTTGTCCGCC TCATGACACA 1200
CCTCGACACA GATGTGAAGA GGGTGGCTGC CGAGTTCTTG TTTGTCCTGT GCTCTGAGAG 1260
TCTGCCCCGA TTCATCAAGT ACACAGGCTA TGGGAATGCT GCK5GCCTTC TGGCTGCCAG 1320
GGGCCTCATG GCAGGAGGCG GCCCGAGGGC AGTACTCAGA GGATGAGGAC ACAGACACAG 1380
ATGAGTACAA GGAAGCCAAA GCCAGCATAA ACCCTGTGAC CGGGAGGGTG GAGGAGAAGC 1440
CGCCTAACCC TATGGAGGGC ATGACAGAGG AGCAGAAGGA GCACGAGGCC ATGAAGCTGG 1500
TGACCATCTT TGACAAGCTC TCCAGGAACA GAGTCATCCA GCCAATGGGG ATGAGTCCCC 1560
GGGGTCATCT TACGTCCCTG CAGGATGCCA TGTGCGAGAC TATQGAGCAG CAGCTCTCCT 1620
CGGACCCTGA CTCGGACCCT GACTGAGGAT GGCAGCTCTT CTGCTCCCCC ATCAGGACTG 1680
GTCCTCCTTC CAGAGACTTC CTTGGGGTTG CAACCTGGGG AAGCCACATC CCACTGGATC 1740
CACACCCGCC CCCACTTCTC CATCTTAGAA ACCCCTTCTC TTGACTCCCG TTCTGTTCAT 1800
GATTTGCCTC TGGTCCAGTT TCTCATCTCT GGACTGCAAC GGTCTTCTTG TGCTAGAACT 1860
CAGGCTCAGC CTCGAATTCC ACAGACGAAG TACTTTCTTT TGTCTGCGCC AAGAGGAATG 1920
TGTTCAGAAG CTGCTGCCTG AGGGCAGGGC CTACCTGGGC ACACAGAAGA GCATATGGGA 1980
GGGCAGGGGT TTGGGTGTGG GTGCACACAA AGCAAGCACC ATCTGGGATT GGCACACTGG 2040
CAGAGCMANT GTKTTGGGGT ATGTGCTGCA CTTCCCAGGG AGAAAACCTG TCAGAACTTT 2100
CCATACGAGT ATATCAGAAC ACACCCTTCC AAGGTATGTA TGCTCTGTTG TTCCTGTCCT 2160
GTCTTCACTG AGGGCAGGGC TGGAGGCCTC TTAGACATTC TCCTTGGTCC TCGTTCAGCT 2220
GCCCACTCTA GTATCCACAG TGCCCGAGTT CTCGCTGGTT TTCGCAATTA AACCTCCTTC 2280
CTACTGGTTT AGACTACACT TACAACAAGG AAAATGCCCC TCGTGTGACC ATAGATTGAG 2340
ATTTATACCA CATACCACAC ATAGCCACAG AAACATCATC TTGAAATAAA GAAGAGTTTT 2400
GGACAAAAAA AAAAAAAAAA AAAAAAAAAA AA 2432

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1742 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

GTCCTGCAGG AGCTCCACGC GGCCGAGGTG CGCANGAACA AGGAGCAGCG AGAAGAGATG 60
TCGGGCTAAG GGCCCGGSAC GRGSGGCGCC CATCCTGCGA CGGAACACGT TCGGGTTTTG 120
GTTTTCTTTC GrTTCACCTCT GTCTAGATGC AACTTTTGTT CCTCCTCCCC CACCCCAGCC 180
CCCAGCTTCA TGCTTCTCTT CCGCACTCAG CCGCCCTGCC CTGTCCTCGT GGTGAGTCGC 240
TGACCACGGC TTCCCCTGCA GGAGCCGCCG GGCGTGRAGA CGCGGTCCCT CGGTGCAGAC 300
ACCAGGCCGG GCGCGGCTGG GTCCCCCGGG GGCCCTGTGA GAGAGGTGGY GGTGACCGTG 360
GTAAACCCAG GGCGGTGGCG TGGGATCRCG GGTCCTTACG CTGGGCTGTC TGGTCAGCAC 420
G1GCAGGTCA GGGCAGGTCC TCTGAGCCGG CGCCCCTGGC CAGCAGGCGA GGCTACAGTA 480
CCTGCTGTCT TTCCAGGGGG AAGGGGCTCC CCATGAGGRA GGGGCGACGG GGGAGGGGGG 540
TGATGGTCCC TGGGAAGCCT GCKTGTGCAN CCGGTGCTTG TTGAACTGGC AGGCGGGTGG 600
GTGGGGGCTG CAGCTTTCCT TAATGTGGTT GCACAGGGGT CCTCTRAGAC CACCTGGCGT 660
GAGGTOGACA CCCTQGGCCT TCCTGGAAGC CTGCAGirGG GGGCCTGCCC TGAGTCTGCT 720
GGGGAGTCGG CATTCTCTGC CAGGGACCCA TGAGCAGGCT GCATGGTCTA GAGGTTGTGG 780
GCAGCATGGA CAGTCCCCCA CTCAGAAGTG CAAGAGTTCC AAAGAGCCTC TGGCCCAGGC 840
CCCTCCGTGG GACAGCCCCG CCGCCCCTCC CCACCAGGGC TTTGCAGATG TCCTTGAAAG 900
ACCCACCCTA GAGCCCTTTG GAGTGCTGGC CCCTCCTGTG CCCTCTGCCC TGGTGGAAGC 960
GGCASCACAA GTCCTCCTCA GGGAGCCCCA AGGGGGATTT TKTQGGACCG CTGCCCACAG 1020
ATCCAGC?rGT TGGAAGQGCA GCGGGTAAGG TTCCCAAGCC AGCCCCAACA CCCITCCCAC 1080
TTGGCACCCA GAGGGGGCTG TGGGTGGAGG CCTGACTCCA GGCCTCTCCT GCCCACACCC 1140
TCTGGGCTGA GTTCCTTCTT TCCCTTGGAC GCCCAGTGCT GGCCTTGGAG GACGGTCAGC 1200
TGGAGGATGG CGGTGGGGGA GGCTGTCTTT GTACCACTGC AGCATCCCCC ACTTCTCCAC 1260
GGAAGCCCCA TCCCAAAGCT GCTGCCTGGC CCCTTGCTGT AAAGTGTGAA GQGGGCGGCT 1320
GAGTTCTCTT AQGACCCAGA GCCAGGGCCC TCAACTTCCA TCCTGCGGGA GGCCTTGGCC 1380
GGGCACTCCC AGTGTCTTCC AGAGCCACAC CCAGGGACCA CGGGAGGATC CTGACCCCTG 1440
CAGGGCTCAG GGGTCAGCAG GGACCCACTG CCCCATCTCC CTCTCCCCAC CAAGACAGCC 1500
CCAGAAGGAG CAGCCAGCTG GGATGGGAAC CCAAGGCTGT CCACATCTGG CTTTTGTGGG 1560
ACTCAGAAAG GGAAGCAGAA CTGAGGGCTG GGATATTCCT CATGGTGGCA GCGCTCATAG 1620
CGAAAGCCTA CTGTAATATG CACCCATCTC ATCCACGTAG TAAAGTGAAC TTAAAAATTC 1680
AATCAAATGA ACAATTAAAT AAACACCTGT GTGTTTAAGA AAAAAAAAAA AAAAAAACTG 1740
CG 1742

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1487 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GGCACGAGCC TCCGCGAACT GTGGAGTCGG CGGAGGGCTG GAATCAGCGT GGGCTCCAGG 60
TCGCTCGCAG CCGGGTGGCA GAACTCTTCC GAGGCTCCTT GGGAAGAAGC TACACCCGAG 120
GGAGCCGGAT GGGCCTCGAA AACCTGGCCC GCTCTGGTTC TGTACCATTG CAAGGGGAAC 180
CGTAAACTGA GCTTTTCTAA CGTGGGTTTC TGCCAAGTAC TTTTCCAGCT GCCCCCTTCC 240
CCCCAGCACA CAGGAGAGCC TCTGTGTAGC CAGCGCTTGA CAGTCGTTAG GTAGGTTGTA 300
CTGTGTAGGG AGGAGCTCAA GATCATGAAT GGTTGTCACA GGAGAAAGCG GTTGCATCTT 360
TGCAAAACTA TATACCTGCT GTGGTTTGTG TTTTCTTTTC TGCTGAGTAA TGAAGTTGTA 420
AGTTCACACT GGCACATTCT CAGGGCTGTG CAGATTATTT GCACTTTATT TCATAGGTGR 480
ATAAGTGCTT TTTAGCTTTC TTTGTATATT GAGTTGCTTT TGAATTGCTT CCCATATTTT 540
TATTTCATAC AAACTGAACA ATTGTGGCCC CTCTAnTTA TTTATAAAGG TTCAGTGTAT 600
CTTTGCCTGC CTACATCAAT CTGCAAGGGA GTTGCAGAAA GCCTCATGTT CATCGAGCCG 660
TGAGTCACAA CCAATTTCTA AGCTGTTATA ACAAAAAAGT GTTTGCTTTT TTTCACAAGT 720
AACTTTAAAA GTGTAGTTTA GAAAGAAAAC ATTTTCAATA AAAAGACACT ACATTAATCC 780
TGGATGCTTG CAAATCCTAA AATMTATTCC TCCTCTAGCG TTGCACAGCT CTGTGTTGTA 840
TACACAGACT AGCTTTAAAA TTTGTCACAT ACCACTTTAC CTTTACTTTT ATGTATCATT 900
CCCCCGACTT CCTTACTGCA GGTGTGGGCA AGAAAACTTT TCCTTTAACA CmTCAACA 960
GCGGGCATAA AATTCTGCAG CTGAGGTCTT GAAGAATGCA GATGGGTACA GTATGTGTTG 1020
GAGCTCACAG TGTGTAITGA CTAACCTAGT TCCTTTTTTG CITTTTTTGG TATTGTCTTG 1080
TTAAAAGTGA CTCCCAGGTA GCAACTCTCT TTTTTAAGGG TGGGAACGAA AGGGACGTAG 1140
GAAGAATAGA TCTAGATTAT TTAACAGTCT TCGATAGAGT TTGAAAGCTT TCTTCTTCAT 1200
TCAATTTTGG GCAAAATACT GCCTCTGCAT TTGTTCATAA CAAAAAGATT AGATTAATAA 1260
GTAGCTTTTG TTGGTGGAAA TTACCAGCTC TATAAGTCAC CCTTGGTGGT TCATGGACCT 1320
CTGATTAGCT TGGGTnTGC AGTCTCATTG CCACATGTAT ATGTGGAGCC AATGGCCTTT 1380
TGGTGCTCAG CTGTTTACGT CTGACTCCTT GACTTCTTTG GTACAGTGAT GGAGTCAGAT 1440
CTCATTAAGT GrGATTCTCC ATGGATATAA CCAGCCCCAA AAAAANG 1487



(2) INFORMATION FOR SEQ ID NO: 51: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1328 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

GGCACGAGCT CC3TGCCGAAT TCGGCACGAG AGAAGATTTG AAGAAGCCAG ATCCAGCTTC 60
CCTGCGGGCT GCTTCTTGTG GGGAAGGGAA AAAGAGGAAG GCCTGTAAGA ACTGCACCTG 120
TGGCCTTGCC GAAGAACTGG AAAAAGAGAA GTCAAGGGAA CAGATGAGCT CCCAACCCAA 180
GTCAGCTTGT GGAAACTGCT ACCTGGGCGA TGCCTTCCGC TGTGCCAGCT GCCCCTACCT 240
TGGGATGCCA GCCTTCAAAC CTGGGGAAAA GGTGCTTCTG AGTGATAGCA ATCTTCATGA 300
TGCCTAGGAG GTTCCTGACA TGGGACCCAT CTGCTCCTCC AGCCAACTCC TGTCCCTCAC 360
ATCCCACCAT GGTGGCTCCT CCCACCTCCT CTGGATTTGT TCACTCTGAG ATCTGTTTGC 420
AGAGTGGGTG CTTAGCAGAC AGAGTGAAGC TGGCTGGGGG GCACAGTGGT GTGTAGTGCT 480
GCTGTGTATC AAAAGACCAA GGTATTATGG GACCTGGTTT CAGAATGGGA TGGGTTTCTT 540
CACCTCATGT TAAGAGAAGG GAGTGTGTCC TGAAGAAGCC CTTCTTCTGA TGTTAAAATG 600
CTGACCAGAA CGCTCTTGAG CCCAGGCATC GTTGAGCATT AACACTCTGT GACAGAGCTG 660
CAGACCCCTG CCTTGAGTCT CATCTCAGCA ATGCTGCCAC CCTCTTGTCT TTCAGAGTTG 720
TTAGTTTACT CCATTCTTTG TGACACGACT CAAGTGGCTC ACAACCTCCT CAGGGCACCA 780
GAGGACTCAC TCACTGGTTG CTGTGATGAT ATCCAGTGTC CCTCTGCCCC CTTCCATCCC 840
CAACCACATT TGACTCTAGC ATTGCATCTG TGTCCTGTTG TCATTTATGT TAACCTTCAG 900
GTATTAAACT TGCTGCATAT CTTGACATAT CTTGAGATTC TGCATGTCTT GTAAAGAGAG 960
GGGATGTGCA TTTGTGTCTG ATGTTGGATA GTCATCCACG CTCAGTTTGG ACCATTGGAG 1020
GAACTTAGTG TCACGCACAA ATGGGGCTAT TCCTACGCTT AGAATAGGGC TTGTCTGCCC 1080
ACTTTAGAAG AGTCCCAGGT TGGTGAGCAT TTAGAGGGAA GCAGGGCAGA ACTCTGAACG 1140
ACAATACGTC TCTCTGAGCA GAGACCCCTT TGTTCTTGTT ATCCACCCAT ATGGACTTGG 1200
AATCAATCTT GCCAAATATT TQGAGAGATT GTGTGGATTT AAGAGACCTG GATTTTTATA 1260
TTTTACCAGT AAATAAAAGT TTTCATTGAT ATCTGTCCTT GAAAAAAAAA AAAAAAAAAA 1320
AAACTCGA 1328



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1856 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

GAATTCGGCA CGAC-CTCTGC ATG.---.CITZ-Z .-. -^-CGAGGCr TCCGCTGCCC 60
CCTAGATTAA ATTCCCCGGG CTGAAArTGA Grr7GC;^.3.-.r: T CAArATCA TATTTTAAAT 120
TGCTGTCTTC AATTAAACCA TTTATGACC'. TAACTAATTr T :ca:?2atgtc gatgcatc-ct 180
TTTCCAGGCC TTCCITCTTT GTACAA.---Jr: A.---~GTCC.-" ; L-AC-CGirrC ACTTATATTC 240
TTCAAACATG ATGCTAATTT AAATTA--:?:.-- CTTCCTATG.^. : rAJ?GT?ATTA TTCCTATGAT 300
TTTGCCACTG TTArTAGTTC TCrCAA.-----.r AC-.TCTAGGC- -A^AGGATTA TTTTAAGrPA 360
TTTGATTATC TTTCTATCTC TTTTATTTAr TTCrrC-.rTr A : rrTAAGAAAT TCGTTCCATT 420
GGTTGGCATT GATACAGTAA ATTTGT.-AA.r GAGGAGACA.-. t rA-AAAAAAT CTAAATTACT 480
TGrTGCTTAAT GACTGTAGCA GAATSCCTtT TC-CTAJLArC ; GATTGTCTr TCTTGCAGTT 540
TAGTITGATA GATTTGCAAG CTATGCTSr? TCCArOAP-GT : A^rTGCGCT GGTAGGAACG 600
CAGGCTTCTT TGTCTCTGGT TGCAGCTTC-C ATGATCjCCr ; A.rrAGGC^G ACAACGTAGC 660
CGGAGATCAC AAArCAGGCC CT?GGT7rA3 TTGCrAGTTT : C'GG.-.C-GTGC AGAGAGGTTG 720
GCAGAAACTG ACCTCACTGG GCAAGGCrTC-G CC-.TG-:^^Cr ; AZTCTTTAA TGCACTCTAT 780
GTGTTCAGGA AGCC^CAGGC CArATr:a-.C TCr^ASAPA:; -AAATAAGAG GAAAAACCCC 840
ACAAAGTATA ACA-.CCCCrT AA3ATAZATC TArTTTA; ra-AA-TTAA-T TTTTCAGT^ 900
ATACCATTGG CCAATTACAA GATAAAA-.TG TTCAArTTCr TTAA^iATCC TTTGTTGACT 960
TGTCTTTTCA TCTCITGCTA nTATA_mG TC-.CT'ITrr.-j; rCAACAAAGr: CTTATTTGCT 1020
GAGGAAGGAC TTTC-CTGCAC TTACTGTArC AC-.TCA-iAC-. Cr':^GG'GAGGG TGGTGTTTAA 1080
CrmTAAAA AATGTTATTC TGAtTATAA-:: AATAArATTr- GCV.'-'ITICA TGAAAAG^JGC 1140
GCCACCTTGC AAGGTTTAGT GAGATTZArG GAA:77rG.-Ar .^jrrrAAGCAG gaattgctgc 1200
TAGCTCCAAA AATTTGCGAA GCAAAAGCTA GCCCCAAT?:- GTTTGGAACT TTGAAACTGA 1260
TTAACAGATT TGCiTTTGAA GTGACTCCAS ACATTAGGTI CA2A2ATTAG TTAAAAATAG 1320
AAAGAGGAAT AAA5ACATCT YTTCTCrCTA GA-AA:^TA-. aACC?.CAArr AATAATCCTT 1380
CCCACTTTCA TTGA3ATCAG CTTGTCTa=-r AACCT^ATAT aAGTGTGAClA ATGATAAACA 1440
TGATAATAGT GGTACTTTTG TAATTTTC^CT GG-GCATTTA AOAAIATAJGT AAAKGATGAG 1500
TTCAYCTTTT CTYCGAACAT YCCTATiCCT AGATG2AG77 ::Axx:rcAA-AT tgggaattat 1560
AACTGTCCTA ArrTTTGTTG TCTACCTTGA TGCCCCTTTT GrrrTAATAC CCACAGTGTA 1620
ACAATTAAAT ATCACACTAT GACATArGAT TTA=J:rrAGai TATTTTAAAJG ATAAATTTTA 1680
GGGGTAAATG TTTACTTCAA AATGACTCCA TATTTCAAAT ATCTGTTrAG ACTGTGAAGG 1740
CCAAATAATT TTTAAGAAAA CATTTGAAGA CTAGTGTGTT TGCATTTGTG AATAATCTTA 1800
CTCACAGCAA GTAAACGTAA TAAAAGCCAA CATTTAAGCC AAAAAAAAAA AAAAAA 1856

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1558 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

TGGGTATCCA TTCCTGNAAT TACTTTACTT AGGATAATGG CCTCCAGCTC CGTCCAAGTT 60
GCTGCAAAAG GTATTATTTC GTTCCTTTTT GTGGCTGAGT AGTATTCCAT GGTGTATATA 120
TACCACATTT TCTTTATCCA CTCATTGCTT GATGGGCAGT TAGGTTGGTT CCACATCTTT 180
GCAATTGTGA GTTGTGCTGC TCCAGATATC ATCTTTAACT CCTTTGCCTT CTCCACATAC 240
ATTTCCAAGT CCTGTTCATT CTACCTCCAA AATGTATCTT GTATCCATTC ATCTCTCTCC 300
ATCTTCAATC TATTTCAATG CCCCATCATC TCTTGCATGG AGGAGTGTAA TAATTGGCTA 360
ACTGGCCTGT TCTTACATTT TAAAATCAAA AGATGTGACA GGTGAAATGC CTATTTCAGT 420
GTCCATTGAT GGTTCTGCTT ACACACCACC TGGCTGCCTG GTGTCGCAGT GGCAGAGTTG 480
AGCAGTGTGA AAAAGACTGC TTGGCCCTTT ACAGGGAAAG CAGGTCCACT GTGGCCTGTG 540
AGGACGAGAG CTCTGGGCAG GCTCGGACAC TGGCAGACCC TGGTCCTGGC TGGCCAAGGC 600
AGCAGGGTAT GTGTTTCGGG TCACTCACAG GGCTCAGCAC CACTCCTCAT GGCTTCCTTA 660
CTGTTTCGGC AGAGGCTGAC CCGCGGCTGA TTGAGTCCCT CTCCCAGATG CTGTCCATGG 720
GCTTCTCTCA TGAAGGCGGC TGGCTCACCA GGCTCCTGCA GACCAAGAAC TATGACATCG 780
GAGCGGCTCT GGACACCATC CAGTATTCAA AGCATCCCCC GCCGTTGTGA CCACTTTTGC 840
CCACCTCTTC TGCGTGCCCC TCTTCTGTCT CATAGTTGTG TTAAGCTTGC GTAGAATTGC 900
AGGTCTCTGT ACGGGCCAGT TTCTCTGCCT TCTTCCAGGA TCAGGGGTTA GGGTGCAAGA 960
AGCCATTTAG GGCAGCAAAA CAAGTGACAT GAAGGGAGGG TCCCTGTGTG TGTGTGTGCT 1020
GATGrrrCCT GGGTCCCCTG GCTCCTTGCA GCAGGGCTGG GCCTGCGAGA CCCAAGGCTC 1080
ACTGCAGCGC GCTCCTGACC CCTCCCTGCA GGGGCTACGT TAGCAGCCCA GCACATAGCT 1140
TGCCTAATGG CTTTCACTTT CTCTITTGTr TTAAATGACT CATAGGTCCC TGACATTTAG 1200
TTCATTATTT TCTGCTACAG ACCTGGTACA CTCTGATTTT AGATAAAGTA AGCCTAGGTG 1260
TTGTCAGCAG GCAGGCTGGG GAGGCCAGTG TTGTGGGCTT CCTGCTGGGA CTGAGAAGGC 1320
TCACGAAGGG CATCCGCAAT GTTGGTTTCA CTGAGAGCTG CCTCCTGGTC TCTTCACCAC 1380
TGTAGTTCTC TCATTTCCAA ACCATCAGCT GCTTTTAAAA TAAGATCTCT TTGTAGCCAT 1440
CCTGTTAAAT TTGTAAACAA TCTAATTAAA TGGCATCAGC ACTTTAACCA AAAAAAAAAA 1500
AAAAAAAAAA AAANAAAAAA AAAAGGGGGC CGCTCTAGAG GTCCAAGTTA NGACG^fGG 1558

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 948 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

TAAAAATCAT GCTCTGTACC ATCCTCACCG TAGTCATCAT CATCGCCGCG CAGACCACGA 60
GAACTACTGG GATCCCTAAA AACGCCCCTG GTCCGGCCCC ACTCTGCGCC CCTCGATCTC 120
CCAGGCTCTT TCTGCAGWCA TACCGCGGAC CCAATGGGCG CCCTGCACAC CCGTTTCTGG 180
GGCCGTCAGA CTTGGATACA TCGTAAACTC CGCCTCCACG GAACGTCTCG CCTKGCGAGC 240
AAGMTCGGAA TCCAGTTCCT CAGGAACCCC TCCAAAACCC ACACCCCCAG GGACGCCGCT 300
TTCCGGGATC CCGGSCAAAC GCCGGACCCT CAGTCGCTCC AGGCCCCCTC ACCCTCAAAG 360
TGTAGCGCCC CCAACCGAGC AACCTCGGTT TGGTCCCTAA AACCCCGCCT CCTCTATAAG 420
CACCGCCCCA GCTCTGACAA AACCCCGCCT CCAQGTCGGC AGGCTCCGCT TCTTTTCTTC 480
TCCGCGGGGT GATTCAGTCC AGTGATTGGG TTTGTGGCTC CAGGCCTCGC CCACAGACGG 540
ACAGACCCCT CCCTTTCTTC CGGCAAAAGG ACCGAGCCCT GGGGTAGTAA GGSCCCCACA 600
CTCCTGTTTT TTGCAAGTAC ATTTTTGTCC YTCCTCCACC CAGGTATCTG CCTATTTTCT 660
TGCTAATCCC AGAACCTTTC CTTTTGCTTT TTTTAAGGAC ATTTGGGAAG TTCCTGGTGT 720
AGGACCCTTC TCCCTGGGAT AAGAAACCTG CCTGTAAACG CTCTGTAAAT ACTCCCTTCC 780
ACCCATCCCA GCCCCTGGGC AGCCGGGCAG AAGGGAATCC AGGCTATGGA CCTCCCAAGT 840
CCCCGCTCCC CGCTCCCCTC GGCGGCCCCG CCTTGTTCTG ATCTGTGTGT GAGTGTGTGT 900
GAACTTCTGA AAGACAATAT TAAAGAGACT TAGTTGAAAA AAAAAAAA 948

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
GGGGAACTGC AGTGACAGCA GGAGTAAGAG TGGGAGGCAG GACAGAGCTG GGACACAGGT 60
ATGGAGAGGG GGTTCAGCGA GCCTAGAGAG GGCAGACTAT CAGGGTGCCG GCGGTGAGAA 120
TCCAGGGAGA GGAGCGGAAA CAGAAGAGGG GCAGAAGACC GGGGCACTTG TGGGTTGCAG 180
AGCCCCTCAG CCATCTTGGG AGCCAAGCCA CACTGGCTAC CAGGTCCCCT ACACAGTCCC 240
GGGCTGCCCT TGGTTCTGGT GCTTCTGGCC CTGGGGGCCG GGTGGGCCCA GGAGGGGTCA 300
GAGCCCGTCC TGCTGGAGGG GGAGTGCCTG GTGGTCTGTG AGCCTGGCCG AGCTGCTGCA 360
GGGGGGCCCG GGGGAGCAGC CCTGGGAGAG GCACCCCCTG GGCGAGTGGC ATTTGYTGCG 420
GTCCGAAGCC ACCACCATGA GCCAGCAGGG GAAACCGGCA ATGGCACCAG TGGGGCCATC 480
TACTTCGACC AGGTCCTGGT GAACGAGGGC GGTGGCTTTG ACCGGGCCTC TGGCTCCTTC 540
GTAGCCCCTG TCCQGGGTGT CTACAGCTTC CGGTTCCATG TGGTGAAGGT GTACAACCGC 600
CAAACTOTCC AGGTGAGCCT GATGCTGAAC ACGTGGCCTG TCATCTCAGC CTTTGCCAAT 660
GATCCTCACG TGACCCGGGA GGCAGCCACC AGCTCTGTGC TACTGCCCTT GGACCCTGGG 720
GACCGAGTGT CTCTGCGCCT GCGTCGGGGG NAATCTACTG GGTGGTTGGA AATACTCAAG 780
TTTCTCTGGC TTCCTCATCT TCCCTCTCTG AAGGACCCAA GTCTTTCAAG CACAAGAATC 840
CAGCCCCTGA CAACTTTCTT CTGCCCTCTC TTGCCCCANA AACAGCANAA GCAGGANANA 900
NACTCCCTCT GGCTCCTATC CCACCTCTTT GCATGGGAAC CTGTGCCAAA CACCCAAGTT 960
TAAGAAAAAA ATAAAACTGT GGCATCTCCA 990

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1603 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
GGTCGACCCA CGCGTCCGGC CCGCCGGCTC CGGAGCGGCT CTGCCTTCCC GAGCGCGGGA 60
CCGCGCCCTG GQGGAGGAGG GCGAACGACG CGGCGATGGC TCCGCGGGCA CTCCCGGGGT 120
CCGCCGTCCT AGCCGCTGCT GTCTTCGTGG GAGGCGCCGT GAGTTCGCCG CTGGTGGCTC 180
CGGACAATGG GAGCAGCCGC ACATTGCACT CCAGAACAGA GACGACCCCG TCGCCCAGCA 240
ACGATACTGG GAATGGACAC CCAGAATATA TTGCATACGC GCTTGTCCCT GTGTTCTTTA 300
TCATGGGTCT CTTTGGCGTC CTCATTTNGC CAMCTOGCTT NAAGAAGAAA GGCTATCGTT 360
GTACAACAGA AGCAGAGCAA GATATCGAAG AAGAAAAAGG iTGAAAAGWr AGRATTGAAT 420
GACAGTGTGA ATGAAAACAG TGACACTGTT GGGCAAATCG TCCACTACAT CATGAAAAAT 480
GAAGCGAATG CTGATGTYTT AAAGGCGATG GTAGCAGATA ACAGCCTGTA TGATCCTGAA 540
AGCCCCGTGA CCCCCAGCAC ACCAGGGAGC CCGCCAGTGA GTCCTGGGCT TTGTCACXIAG 600
GGGGGACGCC AGGGAAGCAC GTCTGTGGCC ATCATCTGCA TACGGTGGGC GGTGTOGTCG 660
AGAGGGATGT GTGTCATCGG TGTAGGCACA AGCGGTGGCA CTTTATAAAG CCCACTAACA 720
AGTCCAGAGA GAGCAGACCA CGGCGCCAAG GCGAGGTCAC GGTCCTTTCT GTTGGCAGAT 780
TTAGAGTNAC AAAAGTGGAG CACAAGTCAA ACCAGAAGGA ACGGAGAAGC CTGATGTCTG 840
TTAGTGGGGC TGAAACCGTC AATGGGGAGG TGCCGGCAAC ACCTGTGAAG AGAGAACGCA 900
GTGGCACAGA GTAGCAGGTG AGCCGTGGTT TTGGTGACAT TGGGGGCAGA GTGGTGCAGG 960
GTCAGGAGAA GGTACTTGGA GCCTCCCAGG TGCTGTQGCA GCATAGGAAT GGTATTTGAC 1020
AGGGAAGTGG GAGAGCTTTC CTTGACCCAG GAAGACTGAG GGGGACTGAA CATGATTACT 1080
TGTCTGCCTA GAGCTTCTTG TAAAGAAGTC ACAAACTTAG TGCCTCCAGG GGCTTGGCTG 1140
TGTGATAATG AGGATAGAGG ATTACTTGTG AGGCAATGTG GCATGGTGGG GATTGTGGCA 1200
AACTAGAATT CACATCACCC ACCATATAGG GCTTGCATTA CCACGAGGCA GAAAGCACCT 1260
AGTGTTGCTG CATCTTCTTA CGCAAAAAAG ACAAAATCCA GACTTCTAAA ATGTAAAATC 1320
ACTGATTTTC GATATTGGCA GCTTACTTTT TTTTTTTAAA CAACCATGCA GGCCAAATGA 1380
CTTGTAATCT TGTCACCATT TTTAGGTAAA CTGTGACTTG AAAAAGTCTG GAGCAAACAA 1440
ACCAATGCTT TTTCCTTTTA TTCTGTTGGR AACCAGTTTT CTTTGTGTCA CAGTTYTGAA 1500
ACCTCAATAC GAATATTTCT CTTCCCACCA AATATTTTGA GGCAATTGAA AAGCCACAGT 1560
GATTTATTTC TTGATTTGGC AATTTTAATT TTGCAAGACA ATT 1603



(2) INFORMATION FOR SEQ ID NO: 57: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1052 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TACAGCTCAG GATGCCTGTA ACATTGTCAT CTCTGGGCTT CTGGGTCCTG CTTAGCCTGC 60
TTTTTCCCTG GAGGACTGAC CAGGGATGCG GCCCAGCAAC ATGTTACTAA ATCATACTCT 120
CCTCCCTACC TTTCCCAGAC CTCTCACTCC TGCCTGGTGT TCCAACCCGT TCTGTCGCCA 180
GAGTATACAT TTTGGAACCT CTTCGAGGCC ATCCTGCAGT TCCAGATGAA CCATAGCGTG 240
CTTCAGCAGN AAGGCCCGAG ACATGTATGC AGAGGAGCGG AAGAGGCAGC AGCTGGAGAG 300
GGACCAGGCT ACAGTGACAG AGCAGCTGCT GCGAGAGGGG CTCCAAGCCA GTGGGGACGC 360
CCAGCTCCGA AGGACACGCT TGCACAAACT CTCGGCCAGA CGGGAAGAGC GAGTCCAAGG 420
CTTCCTGCAG GCCTTGGAAC TCAAGCGAGC TGACTGGCTG GCCCGTCTGG GCACTGCATC 480
AGCCTGAATG AGGCTGGCCA CCTGCCACTT TGCCCTGCCC TCTGCCTCCA GGGCTCCMCT 540
MYCCTTCCTT TTCTTGGTGA AAGGCACCTC CTTTCCTGAT AATGAATGGT GTTCCCTTTG 600
CTTGGCTGGG GAGCCCCCCA GGCCAGGTTT GCTGGCCATA GATACCTTTG GGCTGCCTGR 660
GACAGGCTCC TGAGGAGGAT TGAGGGTGAA AGTCTCCCAC GAGTACACTA AACCTAGGTC 720
TGGTCACCAA TAGGGTTTGG AGAGCAAAGG GCCACAACTC ATCAGCTGCC TGTCTCTTAG 780
ATGCACTTTC TTTTTCCACC AGCACATCCT TCAACACACA GAATTTCAGG GAAGAGTTCT 840
CCCCAAAACC CTAGCTCTTT ACCCTTCCAT TTTAGCCTTC CACCCAGCTT CCACAAAAGA 900
TTTGGCTCTA CCTTGGATCT GCTAGTAAAT AACTAATAGG CAGGCAGTTA TTTGGGTAAG 960
GAAAAAAGGG GTGGGAGAGA CAGAAAATTT GCCCACTGCT GCTCCTCCCC TTGGSTYTCC 1020
ACCTGGGATT TGCTATTGAA TCTCTACCCT NN 1052



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 814 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
ACNCGNTGGC GGCCGCTCTA GAACTAGGGG ANCCCCCGGG CTGCAGGAAT TCGGCACGAG 60
CATAGACTTT TAAACTGGTA CGGTTCTTAG AGATGGTCCT TGGCCTTCTG TTGTTGTTGT 120
KGTTTTTTTC TTTTTCTTCT TCTCCTTCTC CTTCTTCTTC TCTTCTCCTT CTTTCTTCTT 180
TTTTTTTTCA GAG1CTTGCT CTGTCACCAA GACTGGAGTG AAGTGATGTG ATCTCGGCTT 240
ACTGCAACCT GGGAGGCAGA GGTTGCAGTG AGTCGAGATG GTGCCATTGC TCTCGTTTGG 300
GCAACAAGAG TGAAACTCTT GTCTCAAAAA AAAAAAAAAA ATGAGGTTTA AGACAGTTTr 360
GTCATTACTG GTGGGATCTG GTCACACAAG ATAGCATTAA ACGTGACATG GCACATAAAA 420
TTGC3TTAAAA AATTTTGTTT TTTAATTACG TAATGTAAAA GCCCAACAAA CACTTTATGC 480
AAGATTGGAA TGTATCTTCA AATTCAGATT TAATAAACAT GTAAAGATCC TCTGTATATA 540
AAAGTTGTAT TTAATCCCTT GTGCCCCAAG AATGCTATAA AAGATCCCAA GAATGTTATC 600
TATGAAAAGA TAGCAATAGG GAATGGTGAA CAAATAATTT AATTTGCCAA TTCTAAAAAA 660
CATGGACTTA AACCCCATGA AAACTTGGTT CCATAGTTTT AACTGTTTTA TGGTTCCAAT 720
ACAAAACCAG AGTGGTTTAC ATTCCACAAT NACCAAATTT GCATCCAATN TTGGGGTAAT 780
TTTNGGTATT TGCCATGGGA TACTATTCAT TTTT 814



(2) INFORMATION FOR SEQ ID NO: 59: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1215 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

AGAGGAAGTC TTTTGCCAAG CCTGTTCTCT GGACTAACGC CATCCAGGCT GGGAGGGGAA 60 

GAGTGCTCTG CTACACTCGT CCCCCTCCTG CCTCATCTTC CTTCTCAGCC TTCGTTCCTG 120 

ATGGGAACAG AATGGAGGGC CTGAGAACAT ACTTTCTAAA TGCCTTTGAC CCAGGAACCG 180 

ATTATCTATA TTTGTTCCCA TTTTCCTTCA CCGTGACATT CCAGCATTGT CTGACTGTGA 240
GGTGGGCCTT TGAGAGCCTC CAGGTTCCTC AAAACAGGCC TGAGCGATGG GCATCACACC 300
CTCTGCCTAC CCACRTGCCT GCTTACCTGC CAGATAACCA AGTC3IAGATG TCTGCGAGTG 360
GCTAGmrC ACATTCTTAC TAGTGTTTGG YTCACCTTTG GGCAAAGGCC CCCTCTAGGC 420
CTTGCCCCAC CTCCATCAAA CGCAGACACT GTAGTCAGAC CTCAGYAATA TAGGAGGCAA 480
TAATCTTTTA ACAGTGTTTT GCAAACAAAC AAAAAGAGAA AAATCCCAGC CAGGGGAACT 540
CGCCACCTGC CCACGCTAGT TCCATCCACG CTCAAGACCC GCCCTTAGAC CAGGCAGGCA 600
AAGGCCCCCA TCACACTCGG CCACTAGTGG GGTCCTGAGG CCAAGAAAGA AACCAGACCC 660
TGTATGACAA GTTGGGKTCT TTCCAGAACA CGACAGAAAC AGGGGGGGCC CCTTGTTAAT 720
GCCACTCCAT ACTCCAGAAG CATTATTCCT TATTTGGGAC AGCCAAGGGC AGATTCACAG 780
GTTATTGTAG GAATAAAGAC TAGTTTACAA AGGARAAAGA GSCCCTGGAC TTCCCMAGGA 840
AAGGTCAGGT TAGGGCTCCT GTACCCATTC TGTTCCACCA CTGTTTGATC TCTCTGGCCT 900
CCCACCAGGA ATGCCGTTTC CTTTTTATGG ATCTGTTGGG AACCAGAGAG AATCAACAGA 960
TCAATGACAT AGGATCCGAA GTGCAATGAT AGTCACTTCT AGTTTGGCAT TTCACAAACT 1020
CTGNACAGCA AGGTATTGGT AGGTTACTCA ATTTCAAAAG GGCCCCATGG CCAAATATGT 1080
TTAGGAACCG CTGTTTGNAT TTCTTTTTTT GGAGACGCAT TGTATATAAT ATATGTCAAA 1140
GGCTTTCGGA ATTCCTGCAG GAAAGAAATC AGCTTTGTTA AATCCNAAAA AAAAAAAAAA 1200
AAAAAAATAG ACTCG 1215

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 478 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
ATTTCTTATG ACATGGGGGT TTGAATTGGT TGGCAAATGT TTAATTTTAA TATCCATAAT 60
CAGTGAGGTC CTGCTGGCTG TAATCATTAA TTGTGAAATC TAAGGAGCTT AGTTCATGGC 120
TCTAGAATTT CACAGAAAAR TGYGMTATGA TACGAGCATT AAGTTTATTT CTTCTGATCT 180
TTGATGCAGC TTTGTTCAGT TTATCTGTTT TTGTATTTAT TGGTCATCTA CTTCCCATGC 240
CAAAAGGGAC TGGTCTACAT AGCTGCGCTA AACACCTGAT CAAATCACTA AAAGAAAATC 300
TGTTACCTCT AATGAATTAT CCTGATTGTA AGTTAAAAAT CAATATTTCC CCGTAGTGAG 360
GTTTGCTTTT TAAAAAGAAK KCTTAAAAAA AAAAAAAAAA AAACGAGTTN AAGAAAAGGA 420
AGCAAGCTCA GGTAAGGTGC ACACATTGGG CTAAGGAAGC TAGAGCCTGT GGAGANGC 478

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 618 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
TATGACCTTG ATAACCCCAA GTTNGAAATT AACCTTCANT AAAGGGAACA AAAGCTGGAG 60
TTCGCGCGCT TQCAGTTCGA CACTAGTGGA TCCCAAAGAA TTCGGCACGA GTCATAAtGA 120
GCTACTAGGT AAGCCTTCTG GGACTTTCAG ATATTTTGGG GAAGATTGAT TTTTGTTCTT 180
ACATGCTGTG GACCCTTGGC CATCAAATGG TATGGGGAAG CTCATCCGTC TGTCTGTCAT 240
GGTCATGTCA GTCAGGCGTC TTTTTAGTAT TTACTGGGTG CTCAGTACTG TGCCAGATGC 300
TGTCGGGAGC CGTGGTGGTA TGGAGGAGGA GTGCTCCAGA GGACTCTGCT GTGTCGCAGG 360
CCAGCATAAA CAAGCCAAGG GGAAAAGGCA GGCATGGAAT AAAGGGGGAG AATACCAGIG 420
TGTGACTTAC TGCTGACTGT GTGGATTAGC CTATCAGCAG TAATCAAGCA GGGCGGAGGG 480
CATTATCTTT GAGCCAGAAG AGTGAGCACT GGSCCGAGGG TGGAGCATCA AGAGGGGGTG 540
TAGGACCNCA AGGCTTCTTN CNGGGGAGAC AACGTCAATA AGCNGTCAGT AGTCACCGAC 600
AGTTTTGGGA AGCAAGGG 618

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
TCGACCCACG CGTCCGAGGA GCTGGACTTC TGAGACAGCC ATTCTCCTTG CATAGCACIG 60
TCTGCTGCTA CAGCTCATAG AAGTCAACAA TTTTCTTCAA CACTGGTAGG CAGCCTCTAA 120
ATGGCCCTGA TCACCCTCAC CTCCTGCCAT TCACACCNNT GTAAAATTCC ACCCCTGGAC 180
CTAGTGACTC ACTTCTAACA ANGAGAATAC AGCAAAAGTA ACATCGCTTC TGAGGTGAGG 240
CTACAAGGAG ACTACGATGC CTGCCTTGGT CACCCTTCTC CTGCTCTTTC CATTGCTCCC 300
TCTGATGGAA GCCAGTTGCC ATGTGATGAG GTGCCCTATG GAGAGGCCCA CGTGACAAGG 360
TATTGTAAAA AGCCTCTGAC CAATAGCCAT CTAGAAACGG AGGCCCAGTC CAGCAGCCTC 420
TGAGATGAAT CCTGCCAACC TGAGCTTGGA GACAGATTCT CTCCCTATCC IGCCTTGGGA 480
TGATCACAGC CACCACCAAC ACCTTCACTG CCTGGTGAGA GGCCAAGCCA GTGAACCCAA 540
GGTAAACTGG ACAGAATCCT GACCCACAGA AACTGAGATA ATGTTTGTTA TTTTAAGCTG 600
CTCAGnTGT TACAGAGCAA TAGATAACTA ACTCAAACAC CATAAAATTC TAATATTTTA 660
TTCTATCACA CAAACCAGGT AATACCAAGT AAATGCCATT ACTATACACA TATTTTTGTA 720
ACACAATTAC ATGTGATTTT TTAAGAAGGC T 751



(2) i::?C?:-IATICN for SZZ. id NO: 63: 

(i) SEQUZr-TCE C-I^^-'ACTERISTICS: 
5 (A) L^:GTH: 780 case pairs 

(3) Tr?Z: nucleic acid 
(C) STRAIIDECtJESS : double 
{D) TOPCICC/: linear 

10 (Xi) SEQUEiiCE CE3CRIPTICM: SEQ ID NO: 63: 

CI^GMG^STCA CX3TCCCCGA TTCCCGGGTC GACCCACGCG TCCGGGTTGG CAACTCCTGA 
GGCCTGCATG GG?G?-jCrrcA Ci-TTITCCTA CCTCTCCTTC TAATCTCTTC* f AGAGCACCT 120 
GCTATCCCC^. ACTTCTAGAC CIGCTCCiAA CTAGTGACTA GGATAGAATT TGATCCCCTA 180 
^^^^^^^^ TGCC-GTGCTC ATTGCTGCTA ACAGCATTGC CTGTGCTCTC CTCTCAGGGG 240 

20 CAC-CATGCTA ACGGC-GCGAr G^CCTAATCC AACTGGGAGA AGCCTCAGTG GTGGAATTCC 300 
acc<:actgtg actgtcaagc tggcaagggc CAGGATTGGG GGAATGGAGC TC-GGGCTTAG 360 
CTGC<L::>GGTG GICTCAAGCA GACAGGGAAT GGGAGAGGAG GATGGGAAGT AGACAGTGGC 420 

25 

T<^:-I73GC? CrGAGGCTCC CIGGC-GCCTG CTCAAGCTCC TCCTGCTCCT TGCTGTTTTC 480 
Ta--TGA.Tr-G C-GGC-CriGC-G AGTCCCTTTG TCCTCATCTG AGACTGAAAT GTGGGGATCC 
30 AGGAT33CCT TCCTTCCTCT TACCCTTCCT CCCTCAGCCT GCAACCTCTA TCCTGGAACC 
TGTCCrCCCT TICrCCCCAA CTATGCATCT GTTGTCTGCT CCTCTGCA-AA GGCCAGCCAG 
CTTGGGAGC-. GCAGAGAAAT AAACAGCATT TCTGATGCCA AJU^AAAAAAA AAAAAAAACC 
GCGC-CCGAAA GCTTA-TTNCC CTTTAAGTAA GGGGTTAATT TTTAGCrTGG GCACTNGGCC 



540 
600 
660 
720 
780 



40 



50 



60 



(2) H^POFI-LATrON FOR SEQ ID NO: 64: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENG::H: 588 base pairs 
45 (3) TfPZ: nucleic acid 

(CI STRAiXiEDNESS : double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
TTCCGAATTA ATCGACTCAC TATAGGAAi^ GCCGTCGCCA TGACCCGCGG TAACCAGCGT 
GAGCZCGCCC C-CCAGAAGAA TATGAAAAJ^G CAGAGCGACT CGGTTAAGGG AAAGCGCCGA 120 



55 GATGACGGC-C TITCrC-CTCC CGCCCGCAAG CAGAGGGACT CGGAGATCAT GCAGCAQAAG 



60 



180 



CAiSAA-AAAC-3 CAAACQAGAA GAAGGAGGAA CCCAAGTAGC TTIGrGGCTr CGTGTCCAAC 240 
CCTCTIGCCC TTCGCCICTG TGCCTCGAGC CAGTCCCACC ACGCTCGCGT TTCCTCCTGT 30O 
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AGTGCTCACA GGTCCCAGCA CCGATGGCAT TCCCTTTGCC CTGAGTCTGC AGCGGGTCCC 360 

TTTTGTGCTT CCTTCCCCTC AGGTAGCCTC TCTCCCCCTG GGCCACTCCC GGGGGTGAGG 420 

5 GGGTTACCCC TTCCCAGTGT TTTTTATTCC TGTQGGGCTC ACCCCAAAGT ATTAAAAGTA 480 

GCTTTGTAAT TCCAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 540 

AAAAAAAAAA AAAAAAAAAA AAAANNCGGG GGGGGGCCCC CCCCTCCC 588




(2) INFORMATION FOR SEQ ID NO: 65: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 774 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

TTTAAAGATG AAGAAATGAC AAGGGAGGGA GATGAGATGG AAAGGTGTTT GGAAGAGATA 60 


AGGGGTCTRA GAAAGAAATT TAGGGCTCTG CATTCTAACC ATAGGCATTC TCGGGACCGT 120 

CCTTATCCCA TTTAATTAAT TTCTCTGACA ATTCAATTAT TTTCTGTTAT TAATGTTGCC 180 

ACTGCTTTCT GTTTGTCTGC ACTTTCTTGA TAAATATTTG CTATCGTTTT ACTCCAGTCA 240

TTCGATGTTG CTGAGATTTA CATATGACTC TTGTCAACAT CTCATCTTTT GACCCAATCT 300 

TATTCATTTA ATAAGAGGTC TCATTCATTT GCATGGAAAA ATGCTCATTG TATATTGCAA 360 


AGTGAAAATA ACGAGTTGCA AAACAGTGTA TACATATATG TGTGTATATA TGTACACTTT 420 

ATTTGTACAT TTCTATGTGA CATAATGCAA AGGAAAGTGT CTGATTTTAT TATACACCAA 480 

AGGTTAACAG TGAATCTCTG TGTGATCTCT TTTTTTTTCT TTTTGCCTAT CTGCATCTTC 540

TCACTTGCCA AAAAATGAAT ATATGTTTAT GTGTGTATAT TACTTGTGTC ACAAAAAACC 600 

CTAAAGTAGA CAGTAAAAGA ACTTGTCAAT CGCCTTTGGA AGGCAATGAA ACACTTAATA 660 


AACTCTCAAT AACAGAAGCG TAAAAATGAA ATGTAAACCT CCAATTACCT CTGGATCTCT 720 

TAGCCAGAGT AATAAACTGG TAATTATTAC AGATAAAAAA AAAAAAAAAA AANA 774 




(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1866 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
ACCCACGCGT CCGGTCCTCT TCTTCAGCAC ATGCCAAAGC TGTTCCTCAC GGCCTGTCAG 60 

ACAAGAGCAT CTTGGATGTA GGACAATGGA AGAGTTAGAT GCCTTATTGG AGGAACTGGA 120 

ACGCTCCACC CTTCAGGACA GTGATGAATA TTCCAACCCA GCTCCTCTTC CCCTGGATCA 180 
GCATTCCAGA AAGGAGACTA ACCTTGATGA GACTTCGGAG ATCCTTTCTA TTCAGGATAA 240

CACAAGTCCC TTGCCGGCGC AOTCGTGTAT ACTACCAATA TCCAGGAGCT CAATGTCTAC 300 

AGTGAAGCCC AAGAGCCAAA GGAATCACCA CCACCTTCTA AAACGTCAGC AGCTGCTCAG 360 

TTCGATGAGC TCAIXSGCTCA CCTGACTGAG ATGCAGGCCA AGGTTGCAGT GAGAGCAGAT 420

GCTGGCAAGA AGCACTTACC AGACAAGCAG GATCACAAGG CCTCCCTGGA CTCAATGCTT 480 

GGGGGTCTSG AGCAGGAATT GCAGGACCTT GGCATTGCCA CAGTGCCCAA GGGCCATTGT 540 


GCATCCTCCC AGAAACCGAT TGCTGGGAAG GTGATCCATG CTCTAGGGCA ATCATGGCAT 600 

CCTGAGCATT TTGTCTGTAC TCATTGCAAA GAAGAGATTG GCTCCAGTCC CTTCTTTGAG 660 

CGGAGTGGCT TCGNCTACTG CCCCAACGAC TACCACCAAC TTTTTTCTCC ACGCTCTGCT 720

TACTGCGCTG CTCCCATCCT GGATAAAGTG CTGACAGCAA TGAACCAGAC CTGGCACCCA 780 
GAGCACTTCT TCTGCTCTCA CTGCGGAGAG GTGTTTGGTG CAGAAGGCTT TCATGAGAAG 840 


GACAAGAAGC CATATTGCCG AAAGGATTTC TTAGCCATGT TCTCACCCAA GTGTGGTGGC 900 
TGCAATCGCC CAGTGTTGGA AAACTACCTT TCAGCCATGG ACACTGTCTG GCACCCAGAG 960 

TGCrrTGTTT GTGGGGACTG CTTCACCAGT TTTTCTACTG GCTCCTTCTT TGAACTGGAT 1020

GGACGTCCAT TCTGTCAGCT CCATTACCAT CACCGCCGGG GAACGCTCTG CCATGGGTGT 1080 

GGGCAGCCCA TCACTGGCCG TTGTATCAGT GCCATGGGGT ACAAGTTCCA TCCTGAGCAC 1140 


TTTGTGTCTG CrrrCTCCCT GACACAGTTG TCGAAGGGCA TTTTCAGGGA GCAGAATGAC 1200 

AAGACCTATT GTCAACCTTG CTTCAATAAG CTCTTCCCAC TGTAATGCCA ACTGATCCAT 1260 

AGCCTCTTCA GATTCCTTAT AAAATTTAAA CCAAGAGAGG AGAGGAAAGG GTAAATTTTC 1320

TGTTACTCAC CTTCTGCTTA ATAGTCTTAT AGAAAAAGGA AAGGTGATGA GCAAATAAAG 1380 

GAACTTCTAG ACTITACATG ACTAGGCTGA TAATCTTATT TTTTAGGCTT CTATACAGTT 1440 

AATTCTATAA ATTCTCTTTC TCCCTCTCTT CTCCAATCAA GCACTTGGAG TTAGATCTAG 1500 

GTCCTTCTAT CTCGTCCCTC TACAGATGTA 1TTTCCACTT GCATAATTCA TGCCAACACT 1560 

GGTTTTCTTA GGTTTCTCCA TTTTCACCTC TAGTGATGGC CCTACTCATA TCTTCTCTAA 1620

TTTGGTCCTG ATACTTGTTT CTTTTCACGT TTTCCCATTT CCCTGTGGCT CACTGTCTTA 1680 

CAATCACTGC TGTGGAATCA TGATACCACT TTTAGCTCTT TGCATCTTCC TTCAGTGTAT 1740 

T a ri CT TTTT CAAGAGGAAG TAGATTTTAA CTGGACAACT TTGAGTACTG ACATCATTGA 1800

TAAATAAACT GGCTTGTGGT TTCAATAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1860 

AAAAAA 1866

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1152 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

CTCAAGGATG TAAAGGCTCT GCAGATTTCG GGAGGCCTGT CTCCCAGCAC CTGATGGGAC 60

ACTTTTTCCC CCACTGTAAA TTCTGGGTGT ATCCTCCACT GTATGCTGTC ACCCCAAGGG 120 

CAAGCACrcC ATCTGCTTAG TGAAGGATTT ATTGTTCGGA AGATACATTT TCCCCTTKAG 180 


CAGAGAGTGG CGTATCCTGG CAGTCTTCGG TGAGCCAGTT GTACCAGGAT TATGAAATGC 240 

AGATGTTTAC TGTGTCATTG TTGCTGTCAT TGCTACTGAG GAGTACTCAC CAGAATCATC 300 

TGCAACTYTT AGTTGGCAGA GAGGACCACT ATGGCGGGTA GCTCTTTTCT TTCCTGCCAT 360

TGTGGGGATG ATTCCAGGCC T^AAGATGATG GARAAGTATG GAAATCATCT GAAAGGTTGA 420 

AGCTTGGCAC CKSAAGCCAT TCATGACTTT GTAAGGCAGT TTTGCTGAAG GCCAGTTCTG 480 


CCCTGGGAGG GACGGAGGTG AATCCTCCTG AGTACCTGTG GTTTTCTTAC TTCCTGCTGA 540 

ATTTACCTAA GTGCCTGTTG TTTGCTTGCT GTGGAGGCTT TCTGGTATTT CATTTCAGGT 600 

GCAGATGCCT TCACTTTCCC ACCRAAAAAA CCCCMACCAA ACCTAAGACC TTACTGCAAC 660

TAAGTYTNCC AAGTACTTTT TAACCCAATG GGATGAACAG CCTGTGGTCT GCTCAGATCA 720 

CCCTGAGTGC GTGTGAGAAG GCMTNGGCTT TGCCAGGAAA TCCAGGAAGG CAGGGCCGGG 780 


CTGTGTTGGA AGCTGGCTTA GCTGGTGGGG CAGCCTTATT TCAATTAAAA GGGCATTGAC 840 

TCGGAGCAGC AGTCCTGGAG TTrGTTGCAT TTCCTATTGC CCTCAAAATG AGAAACCAGG 900 

AAAATAGCAG ATTGGAGCCT TCGAGAAGGC AGTAAATGGC TGTTTTTATT GACAAAAGGA 960

AAACATTTTA CTGCCATCTC ACTGATGGCA TCTCACTGAC TTAAAATGAA GGCANGTTGT 1020 

AGTAAAAAAA AAAGTCTACA TTTTTCCACC GCCACGTTCT TATATCCTGT TTGTCAGCCA 1080 

CTGCTCANAA GGGCATGTTG TCTTGCGGAN TANAGGCGCT CTCCTTCCCT CGTTTTCCCT 1140 

ATAGGTTGGG TG 1152 






(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2483 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

AGCAGGCGGT GCGCTCGGGG CQGGAGCAGC GCGKAGCCCG GCTCGGCCAC ACCGATCGCC 60 

CGCCGCCATC GGCTCCTCGC AAAGCGTCGA GATCCCGGGC GGGGGCACCG AGGGCTACCA 120

CGTTCTGCGG GTACAAGAAA ATTCCCCAGG ACACAGAGCT GGTTTGGAGC CTTTCTTTGA 180 

TnTATTGTT TCTATTAATG GTTCAAGATT AAATAAAGAC AATGACACTC TTAAGGATCT 240 

GCTGAAASCA AACGTTGAAA AGCCTGTAAA GATGCTTATC TATAGGAGCA AAACATTGGA 300 

ACTGCGAGAG ACCTCAGTCA CACCAAGTAA CCTGTGGGGC GGCCAGGGCT TATTGGGAGT 360 

GAGCATTCGT TTCTCCAGCT TTGATGGGGC AAATGAAAAT GTTTGGCATG TGCTGGAGGT 420

GGAATCAAAT TCTCCTGCAG CACTGGCAGG TCTTAGACCA CACAGTGATT ATATAATTGG 480

AGCAGATACA GTCATGAATG AGTCTGAAGA TCTATTCAGC CTTATCGAAA CACATGAAGC 540

AAAACCATTG AAACTGTATG TGTACAACAC AGACACTGAT AACTGTCGAG AAGTGATTAT 600

TACACCAAAT TCTGCATCGG GTGGAGAAGG CAGCCTAGGA TGTGGCATTG GATATGGTTA 660 

TTTGCATCGA ATACCTACAC GCCCATTTGA GGAAGGAAAG AAAATTTCTC TTCCAGGACA 720

AATGGCTGGT ACACCTATTA CACCTCTTAA AGATGGGTTT ACAGAGGTCC AGCTGTCCTC 780 

AGTTAATCCC CCGTCTTTGT CACCACCAGG AACTACAGGA ATTGAACAGA GTCTGACTGG 840 


ACTTTCTATT AGCTCAACTC CACCAGCTGT CAGTAGTGTT CTCAGTACAG GTGTACCAAC 900 

AGTACCGTTA TTGCCACCAC AAGTAAACCA GTCCCTCACT TCTGTGCCAC CAATGAATCC 960 

AGCTACTACA TTACCAGGTC TGATGCCTTT ACCAGCAGGA CTGCCCAACC TCCCCAACCT 1020

CAACCTCAAC CICCCAGCAC CACACATCAT GCCAGGGGTT GGCTTACCAG AACTTGTAAA 1080 

CCCAGGTCTG CCACCTCTTC CTTCCATGCC TCCCCGAAAC TTACCTGGCA TTGCACCTCT 1140 

CCCCCTGCCA TCCGAGTTCC TCCCGTCATT CCCCTTGGTT CCAGAGAGCT CTTCTGCAGC 1200 

AAGCTCAGGA GAGCTGCTGT CTTCCCTCCC GCCCACCAGC AACGCACCCT CTGACCCTGC 1260 

CACAACTACT GCAAAGGCAG ACGCTGCCTC CTCACTCACT GTGGATGTGA CGCCCCCCAC 1320

TGCCAAGGCC CCCACCACCG TTGAGGACAG AGTCGGCGAC TCCACCCCAG TCAGCGAGAA 1380 

GCCTOTTTCT GCGGCTGTGG ATGCCAATGC TTCTGAGTCA CCTTAACTTT GAACCATTCT 1440 

TTGGAATTGG CGTGGTATAT TTAACCACGG GAGCGTGTCT GGAAACGCAA ACTATCATTA 1500

ATTTCATACT AGTTTGTACC GTATCTGTAG GCATCCTGTA AATAATTCCA AGGGGAAAAC 1560 

TAAACGAGGA CGTGGGTTGT ATCCTGCCAG GTTGAGTGGG GCTCACACGC TAGGGTGAGA 1620

TGTCAGAAAG CGCTTGTATT TTAAACAACC AAAAAGAATT GTAAGGGTGG CTTGCTGCCA 1680 

GGCTTGCACT GCCGTTCCTG GGGGTGTGCA TCTTCGGGAA AGGTGGTGGC GGGGCGTCCA 1740


CTAGGTTTCC TGTCCCCTGC TGCTCCTTCC GTAAGAAAAT GAAATATTCT ATGCCTAATA 1800 

CTCACACGCA ACATTTCTTG TACTTTGTAA GTCGTTTGCG AGAATGCAGA CCACCTCACT 1860 

AAACTGTAAA CGGTAAAGAG ATTTTTACTT TTGGTCTCCG TGAGTCGCAT CTCTACTAAG 1920

GTTTACACAG GAATTCCACC TGAAGACTTG TGTTAAAGTT CTACAGCGCG CACTGTTAAC 1980 

TGAACGTCTT TTTCTTCAGC CTATACGCGG ATCCTTGTTT TGAGCTCTCA GAATCACTCA 2040 


GACAACATTT TGTAACTGCT GCTGTTGCTT TCTACATACA CCTTATAAAG TGACATTTCA 2100 

AAAGAAATAA GGTGCCACAG TTTTAAACCA GAAGGTGGCA CTCTCTGGCT CCTTGTAGTA 2160 

TTATAGCTAT ACTGGGAAAG CATAGATACA GCAATAAAGT ACAGTAATTT TACTTTTTTT 2220

CTTGTX3TTAC ATCTAAATTA CAACCCTTAA TTGCCACGTG TGCACTTACT ACTCTCCAGT 2280 

ATGTCTTATT ACTCTCCAGT ATGTCACGCA TCTTTAACTT TTCACGTCCT ATGTTTGCTT 2340 


TCTCCCATTT TTAAGAGATG GTAAGTTAAC TGGAATTGAT TTACTGAATG AAATTAAATG 2400 

CAGATATCCC TGTTTTTGAA ATAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 

AAAAAAAAAA AAAAAAAAAA AAA 2483

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 536 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

GAGAAATGGA GCTTTGTTAG ATAAAAATTT TTTCAACGCA AACAGTCATT TTCCAGTGAA 60

AGGAGAGCGT ATCCGCCGTA GGATGGACTT AGATCGTGTA AAAGCTGAGG CCACCGAGGA 120 

TATAACCTCC GGGGTCCTTT GCCTCCTTTT CCTTAGACTC CCTCCAAACT CGTGTATCTT 180 


TCCTTCAGCA GTACTCGGCT CCACGCGAAC CTAGTCCTTT GTCTTTACCC TATTACCTTT 240 

CATAACATCC TAGTTGAAAA GTARTTATTC AACCGCGTTT GAAAATGAGA ACAGGTTCAC 300 

AGARGCTAGG TTACTTGCGA AGGTCGTTCA ATTAGTAACC AGTAACGCCA GGACTGCCAG 360
TTTCTTGCTT CCGAATTCTC ATGGTAGCTT TCACCARGCT CCCCGTCMAA TGCTAACGTC 420

7ACTACTGAA CTAGATTAGC AAAAAGGTCT TITAACAGAA TTCCTGGTTT TCAGAGAGAG 480 


TTTCTTTCAT GAAGCGCCCC ATTTCTACAG AGGAAAATAA ACTCCAAGCA GCCAGT 536 




(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 865 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 


CCACGCGTCC GGCCTTTCTT GGCCAGAGGC GCCGGTTGGA CTCACGGGCG GGGCATGATG 60 

GGTAACAGGA CCGGTGGGGT CCCCAGGAAG TCCTAGAGGG GGTCGGGGTT TGGGTGGACA 120 

AGCTTTCCTC GTCCTCTCCC GACAGAGCTG ACGTGTCCTG GGTTCCACCG GGAGCGGGCA 180

TTTCCACCGG ACGGGAGGGT TCGGGGTGTC CGGGGCTGGG GAATACGTAG GGGTTGCCGC 240 

GCGGTGTGGG GAGTTGGGGC GTGTGGCTGC AGTCCCGGGA GTTCTTGGAG GGGGTCGGCC 300 


CACCGAGCTT CCGGACCGGC TGATCTGCCC GTAGCTTGCC GGANGGARGG CGGAGCTGAC 360 

TCTCCGTCCC TTCTCCCATC CCCTCCAGTG GTGGGTACGG GCACCTCGCT GGCGCTCTCC 420 

TCCCTCCTGT CCCTGCTGCT CnTGCTGGG ATGCAGATGT ACAGCCGTCA GCTGGCCTCC 480

ACCGAGTGGC TCACCATCCA GGGCGGCCTG CTTGGTTCGG GTCTCTTCGT GTTCTCGCTC 540 

ACTGCCTTCA ATAATCTGGA GAATCTTGTC TTTGGCAAAG GATTCCAAGC AAAGATCTTC 600 


CCTGAGATTC TCCTGTGCCT CCTGTTGGCT CTCTTTGCAT CTGGCCTCAT CCACCGAGTC 660

TGTGTCACCA CCTGCTTCAT CTTCTCCATG GTTGGTCTGT ACTACATCAA CAAGATCTCC 720 

TCCACCCTGT ACCAGGCAGC AGCTCCAGTC CTCACACCAG CCAAGGTCAC AGGCAAGAGC 780

AAGAAGAGAA ACTGACCCTG AATGTTCAAT AAAGTTGATT CTTTGTAAAA AAAAAAAAAA 840 

AAAAAAAAAA AAAAAAAAAA AAAAA 865 




(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 932 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

TCATCATATA CAAAGTTTTT CGTCACACTG CAGGGTTGAA ACCAGAAGTT AGTTGCTTTG 60 


AGAACATAAG GTCTTGTGCA AGAGGAGCCC TCGCTCTTCT GTTCCTTCTC GGCACCACCT 120 

GGATCTTTGG GGTTCTCCAT GTTGTGCACG CATCAGTGGT TACAGCTTAC CTCTTCACAG 180 

TCAGCAATGC TTTCCAGGGG ATGTTCATTT TTTTATTCCT GTGTGTTrTA TCTAGAAAGA 240

TTCAAGAAGA ATATTACAGA TTGTTCAAAA ATGTCCCCTG ITGTTTTGGA TGTTTAAGGT 300 

AAACATAGAG AATGCIGGAT AATTACAACT GCACAAAAAT AAAAATTCCA AGCTGTGGAT 360 


GACCAATGTA TAAAAATGAC TCATCAAATT ATCCAAITAT TAACTACTAG ACAAAAAGTA 420 

TTTTAAATCA GTTTTTCTGT TTATGCTATA GGAACTGTAG ATAATAAGGT AAAATTATGT 480 

ATCATATAGA TATACTATGT TTTTCTATGT GAAATAGTTC TGTCAAAAAT AGTATTGCAG 540

ATATTTGGAA AGTAATTCGT TTCTCAGGAG TGATATCACT GCACCCAAGG AAAGATTTTC 600 

TTTCTAACAC GAGAAGTATA TGAATGTCCT GAAGGAAACC ACTGGCTTGA TATTTCTGTG 660 

ACTCGTGTTG CCTTTGAAAC TAGTCCCCTA CCACCTCGGT AATGAGCTCC ATTACAGAAA 720 

GTGGAACATA AGAGAATCAA GGGGCAGAAT ATCAAACAGT GAAAAGGGAA TGATAAGATG 780 

TATTTTGAAT GAACTGTTTT TTCTGTAGAC TAGCTGAGAA ATTGTTGACA TAAAATAAAG 840

AATTGAAGAA ACACATTTTA CCATTTAAAA AAAAAAAAAA ACTKGAGGGG GGCCCGGTAC 900 

CCAAATCGCC GCATAGTGAT CGTAAACAAT CT 932 




(2) INFORMATION FOR SEQ ID NO: 72: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 996 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

CGCCTCGCAC CATGAGGACG CCTGGGCCTC TGCCTGTGCT GCTGCTGCTC CTGGCGGGAG 60 

CCCCCGCCGC GCGGCCCACT CCCCCGACCT GCTACTCCCG CATGCGGGCC CTGAGCCAGG 120 

AGATCACCCG CGACTTCAAC CTCCTGCAGG TCTCGGAGCC CTCGGAGCCA TGTGTGAGAT 180 

ACCTGCCCAG GCTGTACCTC GACATACACA ATTACTGTGT GCTGGACAAG CTGCGGGACT 240

TTOTGGCCTC GCCCCCGTCT TGGAAAGTGG CCCAGGTAGA TTCCTTGAAG GACAAAGCAC 300 

GGAAGCTGTA CACCATCATG AACTCGTTCT GCAGGAGAGA TTTGGTATTC CTGTTGGATG 360 

ACTCCAATGC CTTGGAATAC CCAATCCCAG TGACTACGGT CCTGCCAGAT CGTCAGCGCT
AAGGGAACTG AGACCAGAGA AAGAACCCAA GAGAACTAAA GTTATGTCAG CTACCCAGAC 
rrAATQGGCC AGAGCCATGA CCCTCACAGG TCTTGTGTTA GTTGTATCTG AAACTGTTAT 
GTATCTCTCT ACCTTCTGGA AAACAGGGCT GGTATTCCTA CCCNGGAACC TCCTTTGAGC 
ATAGAGTTAG CAACCATGCT TCTCATTCCC TTGACTCATG TCTTGCCAGG ATGGTTAGAT 
ACACAGCATG TTGATTTGGT CACCTAAAAA GAAGAAAAGG ACTAACAAGC TTCACTTTTA 
TGAACAACTA TTTTGAGAAC ATGCACAATA GTATGTTTTT ATTACTGGTT TAATGGAGTA 
ATCGTACTTT TATTCTTTCT TGATAGAAAC CTGCTTACAT TTAACCAAGC TTCTATTATG 
CCTTTTTCTA ACACAGACTT TCTTCACTGT CTTTCATTTA AAAAGAAATT AATGCTCTTA 
AGATATATAT TTTAYGTAGT GCTGACAGGA CCCACTCTTT CATTGAAAGG TGATGAAAAT 
CAAATAAAGA ATCTCTTCAC ATGARAAAAA AAAAAA 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 785 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
GGCACGAGGG GCTTTGCGTA CACAATAGCT GCTAGGAGTA CCCAAAGCCT GARTACARCC 
TGCTQGTGTC ATGGCCACGT GTGAGCAGGC CAGCGTCAMA CGGCTCGCTG TGACCCGTCC 
CGRAGACTGA AATGGGCCTG GGTCTTCTCC TKGTCCTGTG ATWAAAGTCC TCTCTTGAAA 
GTCGAGAGCA AAGGCACACA GAGGTGCGCG CTCACAAGAA TTCCTCCCGG TGACTGGGTA 
ATCAATGTTA CTCCTGTTTC CTTTGCAGGA AAGACCACAG CAAGATTCTT TCATTCGTCT 
CCTCCTAGCC TGGGGGACCA GGCTCGAACT GACCCTGGAC ATCAAAGGAG GGATTATGTG 
GCTGCTAAAG CCATCGGCCC ACAGCCCTGT TCACRTCTTG GTGCTTCTCT TTCCCAGAGG 
CTCGTCCCAG CCAGGCACAC ACAAAAGGCA GATTCTCGTA AACSCAGCCT CCCTCCCTGG 
AGGCTCCCTC CTGCCCIX3GA TCTGGAGTGG AGCTGCTCTG AGATTTTGAG TTCTTCTGCA 
GAGA'TCATTA AATATATCCA AGAGACATTG GAAAACCTGC TGAACATTTT ACATTGGTCT 
GCTCAGCACA TGGCTGGATG CGGATATTTC TATAATTCCA GAAAGTCACA CAGCTCCTCT 
GTATGAGACC AGTGGGCGCC ATTTAAAAGA ACAGGATGAG AATCTAAGAT ATATTATTAA 
TAAATGTAAT GGATTnTTT TTTGTAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
AAAAA 785



(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1069 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

TCCTCACCAT TCCCCTAGGN CAGGTCCCTG CAGGTCCCAC ACTTCTCCCA C3GTCCCTAAA 60 

CTTGGGTCGG TCCTTTCCCT GGAGTAGCTG GNTCCTCCAG TCGAGGTCCC TGTTCAGTCG 120 

GTTCTTAGGC TCCTGCACAT GAAGGTGTGT GCCTGTGGTG TGTGGGCTGC TCTAGGAGCA 180 

GATACAGGCT GGTATAGAGG ATGCAGAAAG GTAGGGCAGT ATGTTTAAGT CCAGACTTGG 240 

CACATGGCTA GGGATACTGC TCACTAGCTG TGGAGGTCCT CAGGAGTGGA GAGAATGAGT 300 

AGGAGGGCAG AAGCTTCCAT TTTTGTCCTT CCTAAGACCC TGTTATTTGT GTTATTTCCT 360 

GCCTTTCCGA GTCCTGCAGT GGGCTGCCCT GTACCCTGAA CCTCATGAGC CTCTAAGGGA 420 

AAGGAGGAAC AATTAGGACG TGGCAATGAG ACCTGGCAGG GCAGARTACA AGCCCAGCAC 480 

CAGTGTCCCA GCCTTACTGG GTCCTTACCC TGGGCCAAAC AGGGAGGGCT GATACCTCCT 540 

TGCTCTTCCT AGATGCCCAC CTCCTACAAT CTCAGCCCAC AAGTCCTCTC CACCCTAGGG 600 

GGCTTGCTGC A1GGCAATAA CTCATAATCT GATTTGGAGG TTTGCCCTTT ACAGGGGCAG 660 

ATTTTCTGCT CAGTTCAACA ATGAAATGAA GAGGAACTCC CTCTTTCTAC AGCTCACTTC 720 

TATCAGAGGC CCAGGTGCCT CAGAGCCACA TTGAGTTGCT TTTTCTGGGA TGAGGAAGTA 780 

GGGTTAAACT CCCCAGTTTC CTGAGGGAGG CTCCTGACAG GTGCCCTTTG TCAGACCCTA 840 

CCACAGCCTG GATAGGCAGC CACATTGGTC CTCGCCCTTG CTCGGNACTC CGTGGTGGTC 900 

CTGCCCTTCT CCCTGCATGC CTGTGGGTCT GCTCTGGTGT GTGAAGGTCG GTGGGTTAAC 960 

TGTGTGCCTA CTGAACCTGG CAAATAAACA TCACCCTGCA AAGCCAAAAA AAAAAAAAAA 1020 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 1069 






(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 831 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 


GGACATTAGA TCACTGTGGA CCTAAAACAA ACAAACAACT ATAAGGAAAA TGGCATTAGA 60 

AATGGTCTGG GGATCAGTTT ATCACTGCAG TTGTTACATC ACCCCATGGT CTAAAATACA 120 

GAGCTTTAGT CTGTCTCTGT TTCAGTTCAT TTTACAGGAG GTGAACATCA CACTTCCAGA 180

AAACTCTGTC TGGTATGAAA GGTATAAATT TGATATTCCT GTCTTTCACT TGAATGGCCA 240 

GTTTCTGATG ATGCATCGAG TAAACACCTC AAAACTTGAA AAACAGCTCC TGAAACTTGA 300 


GCAGCAAAGT ACTGGARGCT GACTGATGCC CTCATGATTT TCCACCCTCT CTTCCCATAA 360 

AGCATCTTCC TAAGGAAATG AMCATGGCCT GATACTCATT TTGTCACTTG TACAGAGCCC 420 

TAAGGATGTT CTGAATTCAG TGGTGCCAAA TAAATGTTGA CATTCCCCTT TTGGTTGATG 480

GAAGTATCAG TGTGGGAACT GTTTGCTTAA TGGCArTTTA TAAAATAAKA AKAKCATATT 540 

AGCAGGGAGG GAGATGATGG AGGGAGGGAG AAGTCCATTT GTCTTATTTA TCCTTTTTGT 600 


ATTAATAGAG AAGCACTTCA CAGTCACTGG CAATGCCATT TATAGGAAGA AGGTTCTGCA 660 

TTCCTGCTGC TCCCGGAGGG CTTAACTTTT TAATGAAAGA ATAAATGCTC TTCCACTCAG 720 

TAGATAAAGT GAAATG1GAA TTGTTAATAA CTGTGCACGG TCAATAAAGC GATGTTTTAA 780

GGAATACAAA AAAAAAAAAA 7VAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 831 




(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 590 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

TATATATAGA CNGTITAATAG TCGTGANTGN TGTGNACGAA CATTAACGGA AGTAGCATGT 60 

AGCCAGTCGA ATAACNTATA AGGACAAAGT QGAGTCCACG CGTGCGGCCG TCTAGACTAG 120 


TGGATCCCCC GGCTGCAGGA TTCGGCACGA GCTGCCAGGT GAGGAGCAGA GAGACTGTTC 180 

CCTTGGGTGG AGAGGTGTGG GCATGAGAGC CACCCATTGC CAAGCAGCAA GAATGTTCGT 240 

GCTTTTTTCC CTTCCAAAAT ATGCAGGGCT CAGGCTCCCA ATTCCGGGCC TGTCTGCnT 300

GCTTCTGTTT CTCCTGTCCC TGTTCTCCCG GAGGGCCCAG GTGGAACTCA CGACAGGGAG 360 

GGAGACGCTT CCCAAAAACC TGCAGGGCTA 1TTCCCAGAA TTTGGTTTTC AAGTACAAAA 420 




CmTTGTCC TGTAAGATAT ATGCAGCCTC ACAGAAGCAG CCTCTGCCTC CACTTTACCA 480

GCTACGTTTT TATCTTAAGC ACATGGGGCT CCCTTAGAAC TTACTCCACT GATTTAAAAA 540

AAAAAAAAAA AAACTCGAGG GGGGGCCCGG TACCCATTCG CCCTAAAAGT 590






(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1274 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

GAGCCACCAC ACCTGGCCTG GAAGGAACCT CTTAAAATCA GTTTACGTCT TGTAmTGT 60 

TCTGTGATGG AGGACACTGG AGAGAGTTGC TATTCCAGTC AATCATGTCG AGTCACTGGA 120 

CTCTGAAAAT CCTATTGGTT CCTTTAITTT ArrTGAGTTT AGAGTTCCCT TCTGGGTTTG 180 

TATTATGTCT GGCAAATGAC CTGGGTTATC ACTTTTCCTC CAGGGTTAGA TCATAGATCT 240 

TGGAAACTCC TTAGAGAGCA TTTTGCTCCT ACCAAGGATC AGATACTGGA GCCCCACATA 300 

ATAGATTTCA TTTCACTCTA GCCTACATAG AGCTTTCTGT TGCTGTCTCT TGCCATGCAC 360 

TTGTGCGGTG ATTACACACT TGACAGTACC AGGAGACAAA TGACTTACAG ATCCCCCGAC 420 

ATGCCTCTTC CCCTTGGCAA GCTCAGTTGC CCTGATAGTA GCATGTTTCT GTTTCTGATG 480 

TACCTTTTTT CTCTTCTTCT TTGCATCAGC CAATTCCCAG AATTTCCCCA GGCAATTTGT 540 

AGAGGACCTT TTTGGGGTCC TAT^TCAGCC ATGTCCTCAA AGCTTTTAAA CCTCCTTGCT 600 

CTCCTACAAT ATTCAGTACA TGACCACTGT CATCCTAGAA GGCTTCTGAA AAGAGGGGCA 660 

AGAGCCACTC TGCGCCACAA AGGTTGGGGT CCATCTTCTC TCCGAGGTTG TGAAAGTTTT 720 

CAAATTGTAC TAATAGGSTG GGGCCCTGAC TTGGCTGTGG GCTTTGGGAG GGGTAAGCTG 780 

CTTTCTAGAT CTCTCCCAGT GAGGCATGGA GGTGTTTCTG AATTTrGTCT ACCTCACAGG 840 

GATGTTGTGA GGCTTGAAAA GGTCAAAAAA TGATGGCCCC TTGAGCTCTT TGTAAGAAAG 900 

GTAGATGAAA TATCGGATGT AATCTGAAAA AAAGATAAAA TGTGACTTCC CCTGCTCTGT 960 

GCAGCAGTCG GGCTGGATGC TCTGTGGCCT ITCTTGGGTC CTCATGCCAC CCCACAGCTC 1020 

CCAGGAACCT TGAAGCCAAT CTGGGGGACT TTCAGATGTT TGACAAAGAG GTACCAGGCA 1080 

AACTTCCTCC TACACATGCC CTGAATGAAT TGCTAAATTT CAAAGGAAAT GGACCCTGCT 1140 

TTTAAGGATG TACAAAAGTA TGTCTGCATC GATGTCTGTA CTGTAAATTT CTAATTTATC 1200 

ACTGTACAAA GAAAACCCCT TGCTATTTAA TTTTGTATTA AAGGAAAATA AAGTITTGTT 1260
TGTTAAAAAA AAAA

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1133 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

AGGATTTTTC CTTGTTCAAC CAAAATCTGA GCATTCTTTC TATGTTGAAA ACACTGAAAA 

ACTAATTTWA GTTAATGAAC TAGAAAGAAT ATTGATTTTW AAGAAACAGA AAAATACTAC 

TTATTTTCCT TCTCAAATAA CGTTTCTTTC AAAAACTTCT GGCTGAAGTA TAACATGCTG 

GTAGTTAACA TAAATCTTGT CTTTCTCTTG TTCTTTATCT TTCTTTGTTA TTTAGATGCT 

TGTATAAATG TCTTTTGTTT TTATTAAGTG CCTAATTGAC AGAGCTTAAT TTGAAGAAGT 

GCCCTAATTT ATTGACCACT TAAGAATTGC CTTTATTGGG GTATTTTATT TGTTCCTGCG 

TCTTTTTGAT GTTGTTCAGT CTACTCATCC CTGTGAGTAT GTGTGGGGGA CAGCTGATAG 

AAGGGAGGAG AGTGTGTCTA TGCTCAQGAT TGCCCTTTAG CCACTCAGCC AGAGATCCAC 

AGGGAGCAAC AAGGACAGTT TCACATGCTT AGACTTTCTT GGAAGAAACA GTGAGGAGGA 

GTAAGTCGTG AGTAGTGTCA AGCTGGATGT AGAATTGTCC TAAGGCAGTT GACCCCACCT 

TCCAACATGT TTTCACTTTA TTTGCCCCTC CCTACATTTG GGTTAGGTTC CATTTGGATT 

TGCAGCAATA ATGACTTTAT TTCTCTCTTG GTCAGGATTT GGCACATAAA ATCCTTTTAT 

TATAGAACTA GCTATTTTAG TTACATAGTA ATGTAACTAA TGGAGAGATT TATAGAGAAT 

TTTGKTnTG CTGTCATATA TGTCCATTTT GGAGACAGAT ATGATAGAAC TAGAAATTAA 

GTTGCATTTC TGCAAGTGCC ATTTGAATGA ACTTCAAGTA TCTTCTTAAT TATTAAATTT 

TCTGATGAAG GCATTGTAAC AAATATATAG TATTATTAAA TCTAATTAAT ATTTGGAAAT 

ATTAATAAAT AGGTATTTTA ITTACTGTAA AAAGTCAAAC TTCATTATGT AGATAAATCT 

TATTCTTTTC ATTCTTTCCC CTGTTTACAT CCTTTTTACA AAGCTTAGTC ACCAATTAAA 

GCTTTCCTAT CAAAAAAAAA AAAAAAAAAA ACTCGAGACT AGTTCTCTCT CCT 



(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 661 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
GAATTCGGCA CGAGGGGAAA AGGATGCTGA ACGAGAGCAG AAAGCCTCTT TCCTTTGCIT 60 
CACGCCTTTC CAGTCTTTAT TTTAAACTCG GGTTCCCTTT CTGTGGTCGC AGCAACCTTT 120
ACTCCACCTG CACTGCTGCT CCTGGGGGCT CCCCAGGCCT CCCTCTGCCT TTCTACCCAG 180 
TGGCTOACGG GATGCCTGTC TTGCCTGGAC GCACCACTGC TCTCCTGTCC CTCACCTTGG 240 


CTTTTGCTGT GCCCTGCTCT GGGGTTGAAG CTGGCCCATG TGTCCCCCGG AGTCATGGCT 300 
GCTCCTCCTG GGAGGCCTCT GTGTGCGrTCA CGTCTTCCAC ACCTGGGGGC AGCTGGCGAG 360 
CCCGTGCTCT GTTCCCCTCG GCTGCTTGGC ACAGAGYTGC AGCCTGGGAY TCTCCGTGGA 420
CCCAGACTGG GGATTTTGCC AGGGGGGCGA TGGGAGGAGC AGGTGCTTTG CCTGGCGGCT 480 
GTGTCTGCAT TTCTGGACGC CCCAGAGCAC AGAAGTTGCC GGCACTTTGA GGTCTTCCTC 540 


GGCATGTGCC AGATTACATG AGTGACGGCT GGGAATATGT TTTCTTTTTT GTAATGGAGG 600 
CGTGTTTCAC ATATAGTAAA GCTCACCAAA AAGTAAAAAA AAAAAAAAAA AAAAAACTCG 660 
A 661



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1378 base pairs

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

ATTGGGTACC GGGCCCCCCC TCGAAGTTTT TTTTTTTTTT TTTTAATGAA AGCTCTCAAA 60

TAAGCGATTT TATTCCTATC CATGATTGCA GACATTTACA AAACCATAAC ATCTGAGTTC 120 

ACCTTAAAAA ATAACTTATA TAAAGCAGTG ATATACACAG CACAAAATAG TTCAGGGAGG 180 


GQGCAGGAGC AACTTGTAAT AATTAAAATG TAAACGTGAA AAAAAGGATG GAATAAAAGT 240 

CCCTACTTAT TTCTACTTAA GATGTCATGT GATAATATTT TACAATGTCC TGTGGGTCAA 300 

TGTATCTATG TGTATATGTC TGTATAACAT ACACATATAC AGTACATTCT CITTCCCACA 360

CATATACATA CACACATAAT TATTTGCAGT TCAGTTTAGG GCAATTCTAA TATGCCACTC 420 

CGTACAGTTG TTTGAATCAC ATTTGGACCC GCTTTCTTCA CAAAAGAGGG GAGAGAGCAG 480
GAAATAAAAA GGTTGGTTTG GTGTGACTGA GATTCCTTTG TTTAACTGTA CACTGTGATG 540

AATAATTTTC TTCCGTAGTA GTTCTGTGAA GGGCTGACTC ACTGTGGTTT TCATGAGGAG 600 

ACTTGGTAAT GGATCACACG CTCATTGTCA TGCTAGGGGA GTAACTCTCA CTCTGAAAAG 660

GATTTAAGAA ATTTCCCCCC ATTTCGCCAT CATCCCTTGG AGTGCCCGGT TGATTACTCA 720 

GGCTCATATT ATTGGGAGAA TTCTTGGAAA TACTGTCCAT ATCTCCTGAG CCTAAAGAGC 780 


CATTCATGTC ATGTGACTCC ATTCCTCCTA ATCCACCCAT GGGACCATCT GACCCAGGRC 840 

CCATTGGAAA ATTAGGTCTG TTAGGTCCAG GAGGTACTGC ATTCATTAAA GTATACATGT 900 

TATCACCAGA GTTGGTTGAA TCTGCTGGAC TAGGCATGAT GGGTGTTCCT GGTGGCCCTC 960

CACCTCCTGG AGGACCTACA TAATTCCCAG GAGATGCTGA GGAGTATGGT ATTGAATTGG 1020 

CATTTGTTGG GTTTGGCCAA GGTCTACCAC CACCTGGACC CATGTTCATT CCAGGCATTC 1080 


CAGGGCCACC TAAAGCATTC AGTGGGGGTC TCATTGCACC TCCATAGTTC TGTGGTCCTA 1140 

AGGGCACCAT TCCTCTTGGA GGAGTCATTC TCTGCATTGG CCCACCCATA TTTGGATGTC 1200 

CTTGTTGTCG AGTTGGATCC ATTCCACTGG GGAGTAATGG CTGACTTCCT GGGACACCTC 1260

CAAGTGCCTG ATTAGGTATC CTCAATGGGG GCCTTGGACC TCCAGGGTAC CGAGGTGACA 1320 

TAAAAGGGTA ATCATGGAAG GCTTTTGCTT CACTTGAGTG TTCACATGTT TCACX3TCT 1378 




(2) INFORMATION FOR SEQ ID NO: 81: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1440 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

ACTTTGTCCA AATGTGTCTG TCACATGTAG TCAGCTGNAG NAATTTAAAA TGAATTGCCA 60 


AGTGAAGAGT CTGTGGATTA ATTGGCCGTT AATTAACAGG CTTTATCAAT GTGTCCTCAA 120 

GGGAGAGGCC CAACCCTAAT TAAGGAGCTA AACTTCCTGA GTGAGGGGCT GTGAGGATGG 180 

AGGTGGAGGA GGCATCTGGG GCGGGTGGTG GCCGGGCCAG CAGATGGCGC CTCCCTGGCT 240

GAGCTGCCCG CACCGCCAGT TCCCTCATTT CCACTCAGGA AGGCAGAGAA GGCAGAGTGA 300 

TCTCCTCAAG GAAGAGCTTC CCCAGCCTTC GGGAGCAGCT GGCAGGGCCT CCGQGAATAA 360 


GCCCTACACG CCGCCGCCTG CCTCCAACTC ACTAACCCTG CGCCTCTTGT CTTTCAGATT 420 

CAACGCGTTC AACAGAAGCC ATCCCCAGCC CAGCTTAAAT TATAAAGATA GACAATAACT 480 

CTGTTCCAAT CTCCGTGGTG CTTCTTTAGT AAATACTGTA CAGATTTTAC CATGGAGAAC 540
TTTTTTTTTA GTTTTTACCT TTTCTTAATT ACCCTTATTC CGAATGGACG AACACTTTCT 600

ACCACTGCTG ACCATTGTAA AATACCGTGT ATATAAATCC CATTGAAATA ATGCCCTGGA 660 


ATAGAACATC TCAAATGCTG CTTAATTACA GACTCAGGTC GATTACTTGT ATTTCATGTA 720 

ATCTTCCTCC AAGTTAGACA TCTGGTGCAA GACCAACCGG GAGACCATGG AATTGTCAAA 780 

AGTACAAACT GACAGTGTGT ATATTTAATT TAAAGACTTA TTTAAAAACT CACAAGCTCT 840

CACCTAGACT TTGGAGAGCA GTCTGTTTTC TGTAATGTCT GATACTAGAA ACTAATTTGC 900 

TTATTTTAGT TGTATTCAAG ATTTGAAGAT GTATTTTATA GACAAGTTCT GrTTTTGAAC 960 


TTTGTGGAAC TGTTCCAATC AATCAATTTC CCAGTTATGA TGAGTATTTA CATTATGAAT 1020 

GTATAACCCA GACATGATTT GTAAAGCCGA CAGTATGTTT CTATTACACA ACACTTTTTG 1080 

ATACAGCGTC TCTTGTCTTC ACTGATACTG GAGTCTCCGT TGTCTGCtING GTCCCTTCGA 1140

GTITCTAGTT ACAGACACAA TCATACTGTG ATTTTATTTT TAATATGGAT ATGCTATCAA 1200 

ACTGTGATAC ACTTATAATT CACTGGTCCT GCATCAGGAG ATGGAGTGGG GAAAACTGTA 1260 


TTTAATACAG TTTGTATCTG AATAATCTGT ATGGTTTATA GAGTTTGTGT TGTTCAGAGA 1320 

TGTTTAAAGT TTGATCTTTG TTTTTCTAAA GATTAAAAAA GCACTTGCCC CACTGTAAAT 1380 

ATACAGCATG TAAAATTTCT RTAGTATATA AATGGCAGCA AATCACAAAA AAAAAAAAAN 1440

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1381 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

CCCGGGCTGC AGGAATTCGK YACGAGGCCA GCAGTTGCTC CCAGriTCAGG AGGTGCTCCT 60

GTACCCTGGC CACAGCCCAA TCCTGCCACT GCTGACATCT GGGGAGACTT TACCAAATCT 120 

ACAGGATCAA CTTCCAGCCA GACCCAGCCA GGCACAGGCT GGGTCCAGTT CTGACCTGAG 180 


CACGGTTTTT CCTCATGTGA CTTCTGGGAA GGCGCTCCCT CATCTGGGCC AAAGGAAGGA 240 

GGACGAAGCC CTCCTCAGCT GGCCTGTGTT TGGGGCATGA ATCTCTCCTC TCCTCCTTGT 300 

CTGGCTCTGT TGACAAACCG GGCATGTTTG GCAGTAAATT GGCACCGTGT CACACTGTTT 360

CCTGGGATTC AAGTATGCAA CCAGAACACA GGAGAAGAAA AGCTCCAGGA TCCCTGTCCC 420 

CATCTGTCCT CTTGATGTGA GAGAGACTCT GAGACTTCTT CCATCGCAAT GACCTGTATT 480 




AAACACAAGC CCCCCAAGCA AAAGAAGAGG TTGAGTTTGC TGCCAGGATT CAGATCAGCC 540 

CTTCCCAGGG TCTGCAGGTG TCACATGATC ACAGTTCAGC GGGAGGCTTT CCGTACCCAC 600 

ACTGGCTGTA GCACTTCAGT CCATCTGCCC TCCAGAGGAG GGTTTCTTCC TGATTTTTAG 660 

CAGGTTTAGA GGCTGCAGCT TGAGCTACAA TCAGGAGGGA AATTGGAAGG ATTAGCAGCT 720 
TTTAAAAATG TTTAAATATT TTGCTTTGCT AATGTGCTGA TCCGCACTAA CTCATCTTTG 780

CAAAAGGAAC TGCTCCCTCG GCGTGCCCCA GCTGGGGCCT CTGAAGGGAT TCCTCACTGT 840 

GGGCAGCTGC CCTGAGCTTC AGGCAGCAGT GTTCATCTCT GGCCAGTTGT CTGGTTTCCA 900 

TGTATTCTAG GCCAGGTAGG CAACACAGAG CCAAGGCGGG TGCTGGAAGC CAGACGGAAC 960 

AGTGTTGGGG CAGGAAGGTG GATGCTGTTG TCATGGAGCT GTGGGAGTTG GCACTCTGTC 1020 

TGCTGGTGGC CCTCTCGGCT CACATG1TCA CAGTGCAGCT CCTGGCAGAC TrGGGTTTTC 1080 

TCTTTGGTGG TTTCTAAAGT GCCTTATCTG CAAACAACTT CTTTTCTCCT TCAGGAACTG 1140 

TGAATGGCTA GAAGAAGGAG CTCAGTAAAC TAGAAGTCCA GGGTTGCTTG GTTTACTGGT 1200 

TTATAAGAAA TCTGAAAGCA CCTCTGACAT TCCTTTTATT AACTCACCTC TCAGTTGAAA 1260 

GATTTCTTCT TTGAAAGGTC AAGACCGTGA ACTGAAAAAA GTGTTGGCCT TTTTGCGGGA 1320 

CCAGATTTTT AAGATAAAAT AAATATTTTT ACTTCTGTCA AAAAAAAAAA AAAAAAATNT 1380 

C 1381 




(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1706 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 


ACTGCACCAC TGCCCAGGTC TCCCGGCTGG ATGAAGACGT GGTCCATGAG GAAGCTGGCT 60 

AGCTCAGACT GGAGAGTAGC TTCAGGAAAA AAGACAAGTG GCCTAAGGAA ATCACGGCCC 120 

CCAACTATCA TCTGAGGGCT AAAGATGAGA AGTAGATCAC TTAATAAGAC AAAAGCCTGT 180

AGGGGGAAAA GAAAGGATGT TTAAAAGGAC AGAATGTTTC CCAAGGTAGA AATGACACTG 240 

TCAATTTCTC CTTGGAATGG GGGCAGGGAT ACTCGCCTTG TTGCTOCCAC TTGAGTCAGT 300 


ACTCACCTGC TCCTGGATCT CAGTATCCAC ATCTGAGAGG CAACTCTGGC AGAGTTCACA 360 

GAAGGCCACC ATTCTGTCCC TCAAACTCGA CAGCTGCTTC TGTGGGCACA GTGGCTTGAA 420 

GGGGAAGAAT GAAGACACAG ACTCCTCTGT TCCCATTATC CCATCTAAGA CCCACACTCA 480
CCTGGGGAAG CATCTGATTT AGAAATGTGG GTTAGTGTCC AGAGAATGGA AAAATAGACA 540

AGAGTCAAGG CTGGCAGGAT AACCTGTAAC AACAAAGGGT TTGAAAAATG AGGTTTGGGT 600 


TAGGAGAGGG AGAGACAGAT AGCCAGAAAC ACACCAGTGA AGAGGAGAGA AAATGAGTAA 660 

AGGGAGAGCT AATTCCTTTT CCAGTGGAAA ATGAGTGATA TTCTGGACAT TCTTCAGAGG 720 

CATCTACACG AAGTAGAAAT GTCACCGCTC CCTAATTTAC TCTACGTCTT CTAGAATCCC 780

TCAATATTAT CCTTGGCTTC CAGGAAATCC AAGAAGACCC TGGAAGTAGA GTCCACCTTC 840 

TAAGAGAGGA ATGTAAGAGG TGACCCCCAC CCACCTGATC TTCCTCGCTT TGTCCACTCC 900 


ACGCACTGAG ACTTGACACA CCTAGTGGCC ACCTAGAACG TAGGTCCTTA AAATYTAGCC 960 

CCCCAGCCCC CAACCCATCT CTAGCCTGTC CACTCACCTG GTGAGGAACY TYTCCTGTGT 1020 

CCACAGCYTT CTGCAGGAGT TGGCAACATG GCTCATAGAG CTCCCAGCGA GTCAGGTCAT 1080

GAGTGCTTTG GGGGAGAAAG GGGAATGTTA TACTGGAAAA GAACAGAGGG AACC^lACTCC 1140 

ACAGACACCA GTAAAAACGG GATGGGGAAG AGGAGGAAAG CCACTCACTT GTAGAAGGCA 1200 


GAGAGGCGTT TCAGAGTGGC TGCCAGATTA TATACCTCAT CCTCATCTAG GAAGGACGAC 1260 

TGAGAAGGAA AGAAGATCCA CAATAGCATT TCCCCCAGAA CTCATCAGTC CACATCCCCC 1320 

GTCTTGCAGC CCCTCCCACC CTTGTTTGGG GTGTCCCATr GTCCAGCCCC AGCTCCTACC 1380

TGTAACAGCT CTTCAAGCTC CTGCTGGAAR CGGTCAGTCA GCAAATCTAC TAGCTGGCTG 1440 

CGGGCAAAGT CCGCCCGGCT GAAGAAAGTG AATTCGGGAT TACAGAGCAG GTAAGAGCAT 1500 


GCGCCCCAGC CTCAAGCACC GCTGGCTCTG CATGCTTCAC CACCACCTCC TGGAGTrGCT 1560 

GCAGGAACAG CTCCAGGTGC TGAGAAGAAA AGGCAGAAGA TGGTGTGCTG TGGGGATGGG 1620 

AGGAGGACAC TCTTCTGGCG GGAAGTGGAA CGGGGTTAAA AGCATTAAAC TTCAAGGATA 1680 

AGATGCCTAA RAAAAAAAAA AAAAAA 1706 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

GAATTCGGCA CGAGCTTGGT AGCCTTAGAA CTGCATGAGC TGCTTTACCA CTGGGAAACA 60 
CGAGCACAGC CTAGCTTGAT TTTGTATGTG GTATCAGATC TAAGGTGGAT GGAATTCAGG 120 

ACTTCCTGTC TACTCTTTGA TTTTGTTTTA TTITTAGAAA TGTTTTATTT TGTTTTATTC 180 
ATTTATTCAT CTTCAGAGAC ATGGTCTGGC TCTGTTGCCC AGGATGGAGT GCATGGTGTG 240 
ATCATAGGCC ACTGCAGTGT TGAGCTCCCG GGCTCAGGCG ATCCTCCTGC CTCAGCTYCC 300 
TTAGTAGCTG GGACTATAGG CACATGCCCT ACCATGCCTG GCTTTGTCTA CTTTTTGAAT 360 
GATGTCYCAA ACTAGAAGGT CTATTAATTT AAAAAATTAA GGATAGCATG CCATAATTAA 420 
AAATAATAAC AGTGGGAAAA GGCACCTTCC AATGATTCAG ACATCAACTT GTGATTTAAA 480 
AAAACGAAAA ATAAATAATA GGAAAAAAAG GGGAAAAAGT TAAATAAAAA TAAAATTAAA 540 
AAAAAAAAAA AAAAACTCGA GGGGGGCCCG GTA 573 



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 


CTCTTTGGCT GTGTCTACCT CCTTCATCTG CTGCGCCGAC ATAAGCACCG CCCTGCCCCT 60 
AGGCTCCAGC CGTCCCGCAC CAGCCCCCAG GCACCGAGAG CACGAGCATG GGCACCAAGC 120 
CAGGCCTCCC AGGCTGCTCT YCACGTCCCT TATGCCACTA TCAACACCAG CTGCYGCCCA 180 
GCTACTTTGG ACACAGCTCA CCCCCATGGG GGGCCGTCCT GGTGGGCGTC ACTCCCCACC 240 
CACGCTGCAC ACCGGCCCCA GGGCCCTGCC GCCTGGGCCT CCACACCCAT CCCTGCACGT 300 
GGCAGCTTTG TCTCTCTTGA GAATGGACTC TACGCTCAGG CAGGGGAGAR GCCTCCTCAC 360 
ACTGGTCCCG GCCTCACTCT TTTCCCTGAC CCTCGGGGGC CCAGGGCCAT GGAAGGACCC 420 
TTAGGAGTTC GATGAGAGAG ACCATGAGGC CACTGGGCTT TCCCCCTCCC AGGCCTCCTG 480 
GGTGTCATCC CCTTACTTTA ATTCTTGGGC CTCCAATAAG TGTCCCATAG GTGTCTGGCC 540 
AGGCCCACCT GCTGCGGATG TGGTCTGTGT GCGTGTGTGG GCACAGGTGT GAGTGTGTGA 600 
GTCACAGTTA CCCCATTTCA GTCATTTCCT GCTGCAACTA AGTCAGCAAC ACAGTTTCTC 660 
TGAAAAAAAA AAAAAAAAAA AAAC 684 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1036 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
TGGAGGCAGA TGCACAGGAG AAAGGTTCCC GTCCGCACCC TCTCAGACCT GAGGCTGAGC 
TTGCAGTGAG GGCTTCTCCT CGGCCCCTCG CCCGCCCCCA GAGCTGCCAT CCCTGCTGTT 
ACAAGCCAGA GGAGCCCGGA TGTGAGGCCC CAGATCACCT CCAGGGACTT GGGGTTCCCA 
TCTGAAATCC TTTATTTTTG TACCATGGGG TGGGCCCCGG GCTGAGAAGG AAGAAGCACC 
CTCTCCCCGG CCTCCTCTGT CTGCACCCGT GGGGCTGTGA CTTACTCCTG CCTCCAGGGG 
CGGGGCGGGG CCCCCTGGGA CCTCTTAAGG CCCAAGGTGG GCCCCAGGAC CTYTGGGCAG 
AGTCGAYTGC TCATGGCAGA TGTGTGGCAA TGTCTGGCTG WGTCTTTCCG GCAMCTGCGT 
YCCCTYTCCC GGGYTCCCCT GCTGCATGGT GGATGTGCTC CTTCCTGGCC CGGTCACATT 
GCCTCCTTGA GCCTTAGTCC AGGGGGTCAC TYCTCCCACC CCACCTACCT CACAGGGTTG 
TTGTGAGGGT GCACAGAGGA GCAAAGTCCC TGAAGGCCCT CAGGCAGTAT ATAGGGGCCG 
CCCACCTTCA GCTGCCCTGG GATGGGAAGG ACCCAGCCCG ACCCCTGGGC ATAACACTGT 
GTTTGCAAAT GGAGATTCAG GTATTGGGGA TGCAGGTTGT GGGGAGCTGG CCTGGCAGAG 
TAGGGGTAGT TGGCTTGGCC TTCTCTTTGG TGATCCCACC CCCAGCCATT TGCATTGCTG 
GCCCAGCGCC TGGCCTCGGG GGCGGGGAGA GGCAGCAGAA GGGGCTGGGC AGGGGCGGTG 
GAGGACTCAG GAACTGCCCG GGGAGAGTGG GTATGGCGGC TGAGCCAGGG GCCCTCCTGT 
GTTTCACTTC CCGGGATGGG TCCTTGCTTC TCAGCTGTGT CCGACCCCAC CATGTAATAA 
AACCCAAAGG AACAGCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAN 
CCCNGGGGGG GNCCCG 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 908 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
TTAAACAAAT GGAATCATGC AATATGTGAC CTTTTGCGTC TGGCTTATTT TATTTAGCAT 
AATGTTTTTG AGGTTCATCC AAGCTGTAGC ATGTATCAGC ACCTCATTTC TTTTTCTQGC 
TGAATATTAT TCCATTATAT GGATTTACCA CAATTCATTT ACCTATTCAT CrTTTGTTTC 
TGCTGTCTGG CTATTGTGAA TMTGCTTCG ATAAACATTC ATATACAAGT TTCTATGTGG 
CTTTATGTTT TCATTTCTCT TGGCTATCTA CATGGGAGTA GAATTCTAGG TCATAATATA 
ATTTTATGTT TAACTTCTCA AAGAATTGCC AAAAGGTTTT TCATAGTGGC TGCATCATTT 
ACATTCCCAC CGGCAATGTA CAAGGATTTC TATTTTTCCA TATCCTTGCA CTTACCAACA 
CTTCTTTTTK GTOATWATTT TGTTTTTTCA TTATTGCCAC CCTAGTGGAT GTGAAATGGC 
ATCTTATTGT TTTGATTTGC ATTTCTCTAA TGACAAATGA TATCATACTT TTTTTATGTG 
CTTACGGATC AAAGGTATTT CCTTGGAGAA ATGTCCCTTC AAGTCCTTTG CCATTTCAAA 
ATTTGGTTAT TTGTCTTTTA TTATrCAGTT TTAAGAAATT CTGGCCAGGC GCAGTGGCTC 
ACCTGTAATC MTAGCACTTT GGGAGGCCAA GGCGGGCAGA TCACTTGAGK TCAGGACTTC 
GAGACCAGCC TGGCCAACAT GGTGAAACCC CATCTTACTA AAAATACAAA AATTAGCTGG 
GCGTGGTGGC AGGTGCATGT AATCNTATCT ACTCAGGAGG CTGAGGCAGG AGAATCGCTT 
GAACCCAGGA GGCGGAGGCT GCAGTGAGCC AAGATCACGC CATTGCACTC TAGCCTGGGT 
GACACAGA 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
TGCACTGGTT CCTTCTCCCC AGCAAATACT GCCTTCTTGT TTTTCTCTGA TGTGGCAGGT 
GACTACAAAA TCCGCCTTGG TATTCTTCAA ATGCATATAT ATTCCTTTCT TGTCAGCTCC 
CTCTCTTCCT AGATTAGAAA ACTGCCTCAT TTTCTGCTCA CTGGATGTGC AGTCCCAGCT 
TGTCTTCCTC TCCTCCCCCC CTGTTGCAGG TGTTCTTTTT TTTTTTCTTC TCTCCCCACT 
GGGCAGCAAA AGTTGTTCCA CAGTGGAAAW TTAGGCATCC TCAAGTTTCY TCCCAGCTTC 
TGCTGTC?rTT TCTTAGAGTA AATTGCCAAT TTCTGTTTTT ACAGGAAATC CTTTTTTAAA 
AATGGAATCA GTGTGGTCCC CATCTACTCT GCAAAAATTG CATTTTrCTC TATTTTCAAA 
TGAGATTTGT TCAAGTTTCA AAACCACGTG AAATAATAAA TGTATAGTAG TTTTCTTTTC 
CTTGGGCATT GCTWGATATG TGAAATGGGT TTATGAAAAA TAATAAAATC ATAACGCTAT 
TTGTTTGACT TTCAATTTCA TGGGAATTTT TCTCAGCTAA ACTCTAAATG GTGATTARGC 
AAAAAAAAAA AAAAAAAACY GRAGGGGGGC CCGGTACCAA TTCGCCCTAT AATGA 655 



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1102 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

TTTTTTTTTT ACCATTTAAA ATAAAATGAA AGTGACCTTC TGTTTATAAA AATCTTTGTC 60 

TGCATCTCTG CTTATTTCCT TAGAAGAGAT TCCAAGAAGC GGTGAGTGAT TTCACGGCAG 120 

CAGAGGGTTG GGACATATTA CGGGCGCGGA TCCCTCTTGG AGTGAGATGA CTCTCCGGAG 180 

AGATTTAGTC GTCACCCTCG CGTGTGAGGC TGCGTCACAC CCCAGGGATG TGTCTATCAA 240 

GATGGAAGAT CTTTTACACG CTCTTGATTT TGTTTGSCTY TTTTTCTATT ACTAGTGAGA 300 

AKGAAACTTT TTATATGATT ATTATCCATC ATAATCCAAC ACAAATTACT GCTTCATGTT 360 

CTTTTACTTT CCTGTGAAGG TTTTAGTGCC TTTTAAAAAT TGCTATATAT TAAGCTTGIT 420 

AATACTTCCA TGCTGTATTT GTGGSCATCA RTTTCCCCGG GNACAGGCNT GCACATTTTC 480 

CCTTCACACG CTGGGTGGTT TTTCATTTTC AmTCTATTT CTCGTTCTTC TATCGrTTTA 540 

TGTTCAGACG GGTTTCTCCG TGTAGAAAGC AGTTTATGAA GATTTACTTT CGACAGTCTT 600 

CTCTCTACTT TCTACAGTGA ATTCTCTGAT GTGTCTGGGA GTTTGGGGGT CTGGGTAAGA 660 

RTCCTCCTCT CACCCTATTC TCTATTACGA TCCACAGCCT CATGCTTTAT GARATTGGTC 720 

GCCGGGARCG GGGGAGATTT GCGGATCCCC CAAGCCAGAC TTTATCCCCC TATCCCTGCC 780 

TCTGGATCCC ACGTACAGGC CTGGGAACTC CCTGTGGGTA GGGGCCAATG GTCTCGCACT 840 

CTCACCTGTA CCCCAGGGCT GGCACAGGAT GGTCAAGGAG AGAGGCTGCC CAAGCGCATC 900 

CYTCTGGTGT CCCCCTGACA CGCCTCCAAA GTGAGCAGGT AGGTTTCAAC AGCCCCACGT 960 

TGCAGGTGGG AGATGAAGCT CAGGGTGGAG ACCAGTATCT CACAGTTCTC TTTGCATGGC 1020 

CGGGTACTTG TTAGTCAACT GATCAAGTGA AAATTCTAGC CCCAGAGGCA GGAGAATCCG 1080 

GAACAAAATT AAACCAGCCA GG 1102 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1533 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

GGCACGAGCC GNCACGGGCA GCGCCCCATA GCGCCAGGGA CCCCCTGGCA GCGGGAGCCG 60 
CGGGTCGAGG TTATCGATCC AGCGGGCGGC CCCCGGGGCG TGCTCCCGCG GCCCTGCCGG 120 

TGNCTGGTGC TGCTCAACCC GCGCGGCGGC AAGGGCAAGG CCTTGCAGCT CTTCCGGAGT 180 

CACGTCCAGC CCCTTTTGGC TGAGGCTGAA ATCTCCTTCA CGCTGATGCT CACTGAGCGG 240 

CGGAACCACG CGCGGGARCT GGTGCGGTCG GAGGAGCTGG GCCGCTGGRA CGCTCTGGTG 300 

GTCATGTYTG GAGACGGGCT GATGCACGAG GTGGTGAACG GGCTTCATGG AGCGGCCTGA 360 

CTGGGAGACC GCCATCCAGA AGCCCCTGTG TAGCCTCCCA GCAGGCTCTG GCAACGCSCT 420 

GGCAGCTTCC TTRAACCATT ATGCTGGCTA TRAGCAGGTC ACCAATGAAG ACCTCCTGAC 480 

CAACTGCACG CTATTGCTGT GCCGCCGGCT GCTGTCACCC ATGAACCTGC TGTCTCTGCA 540 



CACGGCTTCG GGGCTGCGCC TCTTCTCTGT GCTCAGCCTG GCCTGGGGCT TCATTGCTGA 600 

TGTGGACCTA GAGAGTGAGA AGTATCGGCG TCTGGGGGAG ATGCGCTTCA CTCTGGGCAC 660 

CTTCCTGCGT CTCGCAGCCC TGCGCACCTA CCGCGGCCGA CTGGCCTACC TCCCTGTAGG 720 

AAGAGTGGGT TCCAAGACAC CTGCCTCCCC CGTTGTGGTC CAGCAGGGCC CGGTAGATGC 780 

ACACCTTCTG CCACTGGAGG AGCCAGTGCC CTCTCACTGG ACAGTGGTGC CCGACGAGGA 840 

CTTTGTGCTA GTCCTGGCAC TGCTCCACTC GCACCTGGGC AGTGAGATGT TTGCTGCACC 900 



CATGGGCCGC TCTTCCAGCTG GCGTCATGCA TCTGTTCTAC GTCCGGGCGG GAGTGTCTCG 960 

TGCCATGCTG CTCCGCCTCT TCCTGGCCAT GGAGAAGGGC AGGCATATGG AGTATGAATG 1020 

CCCCTACTTG GTATATGTGC CCGTGGTCGC CTTCCGCTTG GAGCCCAAGG ATGGGAAAGG 1080 

TGTGTrTGCA GTGGATGGGG AATTGATGGT TAGCGAGGCC GTGCAGGGCC AGGTGCACCC 1140 

AAACTACTTC TX3GATGGTCA GCGGTTGCGT GGAGCCCCCG CCCAGCTGGA AGCCCCAGCA 1200 

GATGCCACCG CCAGAAGAGC CCTTATGACC CCTGGGCCGC GCTGTGCCTT AGTGTCTACT 1260 

TGCAGGACCC TTCCTCCTTC CCTAGGGCTG CAGGGCCTGT CCACAGCTCC TGTGGGGGTG 1320 

GAGGAGACTC CTCTOGAGAA GGGTGAGAAG GTGGAGGCTA TGCTTTGGGG GGACAGGCCA 1380 

GAATCAAGTC CTCGGTCAGG AGCCCAGCTG GCTGGGCCCA GCTGCCTATG TAAGGCCTTC 1440 

TAGTTICTTC TGAGACCCCC ACCCCACGAA CCAAATCCAA ATAAAGTGAC ATTCCCAAAA 1500 

AAAAAAAAAA AAAAAAAAAA ANCCCGNGGG GGG 1533 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 575 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 
ATCCTCTGGA ATCTAGGTGG AAGCCACCAA GCCTTCTTCA CACTTGCGTT CT^^AGCATCT 
GCAGACTTAA CCCCATGTGG CAATCACCAA GGCTTATGGC TTGTCTCCTC CAGAACTGTG 
GCCAGAGCTG TACCTGGGCC CCTTTGAGCT GAGGCTGAAG CCAGAGICTC AAC-CTCAGCA 
GGGCAGTARG GCCCTGGGCC TGGCCCCTGA AACCATTCTT TTCTCCTAAG CCTCTGGGCC 
TTTGATGGGA RGGGCTGTCC TCAAGATTTT TGAAATGCCT TrOGAGGGTr riTGCCTTGT 
CTTGGATATT GGCTTCCTTT TAGTTATGCT CATCTCTCTA GCAAGTGAAT CTTTCACAAC 
CTGCTTGGAT TCTTTCTCTA CCACAGARCC AGGCTGCAAA TTTTTWIIAAAC mTACACTC 
TGTTTCCCTT TTAAATATAA ATTTCAATGT TAAGTCACTT CTTTGCTCCC ATATCTGATT 
TAGGTTGCTG GAAGTAGCCA AGTCACCTCT TGAATGCTTT GCTGCTTAGA AATTTCCTCT 
ACTAGGTAGC CTGGGTCATC ACACTTAAGT TCAAA 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 639 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 
TCCTTTCATC TTAAGCACCA CCCGACAGGG CAGGTACTAT TACCATCTCC GmGACAGA 
TNAGGAACCT GGCACAGGAA GCATTTAAGT GGATTCCCCA GGATCGCCCC ACTGTCAGGA 
GCAGANTCAG AATGGGCCTC AGCATCAGGC TCCCAATCCT GGCTTCTAAC TGCIGCGCTC 
TGCCCTTCYC TCWCCCCACC TCCCCACTCC AGTGCCTTTG GTCATGCCAC TCCAGCTTTC 
AGGCCAATAC TGGATTAGCC TCTTAGTGTT CTTGTCCCTG CAGCCATTTC CCCAGGCAGC 
AATTCCATGT GCCCTCACTG ATGTAGGTGG CTCTTGTGTC ATTTGTCACA TCCTATTCAA 
TTGTTTATGC ATCTTGTTCA CACTCACAGC ACCCTCCCTC TCACACGTCC TCCTTATAAA 
AATGTCCCTC AGTGTCTGCT ATGAGCCAGG TGCAGACTTA AGTGACAGGG CTGCTACGGG 
AAATAAAAAA TTAACAAGGA GCACCTGCCT CTTAATGCAC AGTAACAAAC TATGTTAAGT 540 
GTCAGGAAGG AAAGGTTAAG GATGCCAGGA AGGCTTTTAA TAAATAACCT GACTTAGATG 600 
GGCAGGTGGT GCTGARGATT AAGAACGTGT TCTTCTCGA 639 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 744 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

GAATTCGGCA CGAGAGIGGC TGGAGTCTGG CTGCAGAGGG AAGACATCAG CAGGGAGGGA 60 

GCCAGGGCCT GTCACATCTT TCCTCTGGCC ATTGTCCTGG TCTTTGTAAG CCCAGAATCT 120 

CCCCTTCCCT GAAGGGAGGC CAGCACCCCA GGAGGGCAGC AGGTCTGCTG TGAGGGTTGG 180 

AGTAGTCTCA GAGGTCAGGG TACACTAGAA TGGCCATGGA CACCATGTGG GGGTGCTCTG 240 

GGCTGGGCCA CAGAACAGTG TCCTTCCTGC TGCTCCTCCC CTGCAGCTTC CCCCGACCTT 300 

CTNGTITArr TGGTTTGATA CCAATCAGCA GACCCTGCAA GGTGGAAGCT CCCAGGCTCT 360 

CAGTCCCACS ACTCTCATGT GCCAGTCACC CNTACTGTAA CTGCCCAATG AGTACTTCTT 420 

GCCCACTGCC AAGATAGAGC CAGTrTACCA AGACAGGGGA ATTGCAGTAG AGAAAGAGTT 480 

GAATATACAT AGAGCCAGCT AAATGGGAGA GTOGAGTITT CTTATTACTT AAATCAGCCT 540 

CCCYTAAAAT TCAGAGGTCA GAATTTTTCA AGGACAGTTT GGTGGSCAGG CCTAGGGAAT 600 

GGATGCTGCT GATTGGCTAG GGATGCAATC ATAGGGGTGT AGAAAAGTWC CTTGTCCACT 660 

GAGTCCACTT TTCGTGAGAG CTACCAAGGA GCTGCTGGTC TGCTGGTCCC GGTAGAGCCA 720 
TCTGGTGTCA GGAATGCAAA AGTG 744 



(2) INFORMATION FOR SEQ ID NO: 94: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 526 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

GCAGGGGAAT TCGGCCACGG AGGGGTITCA ACAGGGCCCG TGGGGTGAGG TGCARACACA 
AAGCCCATAA GTGCTGGCCT GTTGGGACAA ATGAGAGAAA TCCCATAGGG TGGTCATGAC 
AGCGCAYTCA GCCATCYTAY TCCTGGGGAA AATGAAACTT GTGCTCCTAT CAAATGCTCA 
GTTGTAAAAC TGGAAAAAAA TTTTAGAAGA CATCTTGTCC AGCATCTCTG TTTATCTCTA 
TAAAATGTAG AAAACTAAAG CACAGAGATG TTAAATGTTT TGTCCAAGGT CCAACAGCTG 
GTTAGCARGC TTGGTCTGGT GACCTTTCTA CTGAACCACA GTGCCGCTGG GGGAAGTCCT 
CAGCACAGAT GGCTGCTGCT ATAGCTGGGG TATQGGCAGT ATTAGTAGTT AACCAGTCAA 
CCCAAGTTCC CATAGTCTAG GTTCTGCTTC AGCTGGAGGT TAGGGAAAAA CACAAGAAAA 
TCCCTTACCA CTCTACCAGT GCTGGGGGAT GTACTAAGAG ATCCCC 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 426 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 
GGCACAGGGC AGGAGAGACT TGGTCCATGG GGAGAAGCCT GCAGTATAGA OGGGACCTCC 
AGGAGCCCAA GTAGCATAGA CCCTGCTGAT CCGGGGCCAT TGAGCCAGAG GATTTGGGCT 
GAATGTCCCC AGAGACAAAA GGGAAAGGTA GATCCTTTCC CTTAAAGATG AAAGCCATCG 
CCCGGGCTTG CTTATTGCTC TCTCTCCTGG TCCTTCCACA TCTTGTTTCT GAACATTTGT 
TCTGGCATCA CAATCCCCGT CATCCTGTCA TCTGGCCCTT CCCACCTTTC CACCTTATCT 
CTTGCAGTGT CTCCGCGTCG ACCTGGCACC TGGGTGAARG CTTGCTCTTC CTGGTGCCCA 
TAGCCCCCAG TGTATGGTCT TGAMCTCCCC AGCCATATCG ARACCCACCT CAGGAGGGCC 
CCTCGA 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 844 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 
GGCACAGCGG CACGAGATAG GAAGCTTGGC AGGGGCAGCT CCCCCAGTGC GCAITGCCCT 60 
GTAACTCGAG CGCCTGGGAG TGGGGAGAGG CTTGGAAATG GAGCAGGGTG GTGGACCTCG 120 
TCrrCTCCTG CTCATCCCAG GCCTCCTCCA TAACACCTAC CTAGCACGGC CTGGGGACTT 180 
CCCAGCCCAA GGAACAACTG AGAATACTGA GTGCCAGGGT AGCCCTAGCC CCATTTCACA 240 
CCTGGGCAAA GTOAGGTCAC TGGA1TCAAA CACTCAGATT TAAACCTCCT CTGTGTCTGC 300 
AGCACCTCTA TATAACTCCC AGCCTCTGCT GCCCCTCTCC AAAAAGTCTC TGCCCTTGTC 360 
TTTGGCACCT GTCTCTGTCC TCCCCATTCT CTGCTCCTCC TTTCTCCAAC TCAGANTCAC 420 
CCTGTTAGrr CAGCAAATGT TCATCGAGCT CCATAATGTA GCAGGACAGG NCTGTCTAAC 480 
AGATTCTCGN CTTGCAAGGG TGAGACAAGT ACTCTCCATC TTTCTCTCAT CTTCACAGAT 540 
GGTCTCCTCA ACAACTTTGC ACTGAATTGT AAATAATTGA TACTGCATAA AACATTGATG 600 
TTCnTAAGG GTAGTCCAGC AAGGTGGCAA GTCTTATAAT GATAACTGCT CAAGGATCTC 660 
TCAGTCAAGC ATTTCGGGST GCTAGCTCTG CCTATGGGTG AGGTCAGCTA TCTCACGCCA 720 
TCTACTTCCA CNTGCCCCCC CATGCCAGGC TCACCCTGAG CTGAGATGCC TGAGCAGGTG 780 
GCAGAAAGGA GCCACCTCGT TTATGCTTCG GGACCACAAA CTCCTCTATC CAGANGACAG 840 
TTTT 844 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1985 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

AGCCCTGCTG AAGTACAGGT TCTTCTATCA GTTTCTGTTG GGCAATGAAC GAGCAACAGC 60 

AAAGGAGATC AGGGATGAAT ATGTGGAGAC GCTGAGCAAG ATTTACCTGT CTTACTACCG 120 

CTCTTACCTG GGGCGGCTCA TGAAGGTGCA GTATGAGGAA GTCGCTGAGA AAGATGATCT 180 

AATGGGTGTG GAAGATACAG CAAAGAAAGG ATTCTYCTCA AAGCCATCGC TCCGCAGCAG 240 

GAACACCATT TTCACCCTAG GAACCCGCGG CTCTGTCATC TCCCCCACTG AACTTGAGGC 300 

CCCCATCCTG GTGCCTCACA CAGCGCAGCG GNAGAGCAGA GGTATCCATT TGAGGCCCTC 360 

TTCCGCAGCC AGCACTACGS CCTCCTAGAC AATTCCTGCC GCGAATACCT TTTCATCTGT 420 

GAArmTTG rroiCTCTOG CCCAGYTGCA CACGACCTGT TCCATGCTGT CATGGGCCGT 480 

ACACTCAGCA TCACCCTCAA ACACCTCGAT TCTTATCTAG CTGACTGCTA CGATGCCATT 540 

GCTGTnrrc TCTGTATCCA CATTGTTCTC CGGTTCCGTA ACATTGCAGC AAAGAGGGAT 600 

GITCCTGCCC TGGACAGGTA CTGGGGAACA GGTGCTTGCC TTCCTATGGC CACGGTrTGA 660 

ACTGATCCTG GAGATGAATG TTCAGAGCGT CCGAAGCACT GACCCCCAGC GCCTAGGGGG 720 

C5TTGGATACT CGGCCCCACT ATATCACACG CCGCTATGCA GAGITCTCCT CCGCTCTTGT 780 

CAGTATCAAC CAGACAATTC CTAATGAACG GACCATGCAA TTGCTGGGAC AGCTCCAGGT 840 

GGAGGTGGAG AATTTTGTCC TCCGAGTGGC AGCTGAGTTC TCCTCAAGGA AGGAGCAGCT 900 

TGTGTTTCTG ATCAACAACT ATGACATGAT GCTGGGTGTG CTCATGGAGC GGGCTGCAGA 960 

TGACAGCAAA GAGGTTGAGA GCTTCCAGCA GCTGCTCAAT GCTCGGACAC AGGAATTCAT 1020 

TGAAGAGTTG CTGTCTCCCC CITTTGGGGG TTTAGTGGCA TTTGTCAAGG AGGCTGAGGC 1080 

TTTGATTGAG CX5TGGACAGG CTGAGCGACT TCGAGGGGAA GAAGCCCGGG TAACTCAGCT 1140 

GATCCGTGGC TTTGGTAGTT CCTGGAAATC ATCAGTGGAA TCTCTGAGTC AGGATGTAAT 1200 

GCGGAGTTTC ACCAACTTCA GAAATGGCAC CAGTATCATT CAGGGAGCGC TCACCCAGCT 1260 

GATCCAGCTC TATCATCGCT TCCACCX3GGT GCTGTCCCAG CCGCAGCTCC GAGCCCTCCC 1320 

TGCCCGGGCT GAGCTCATCA ACATTCACCA CCTTATGGTG GAGCTCAAGA AGCATAAGCC 1380 

CAACTTCTGA TGTGCCAGAA ACCGCCCTGA GATCTGCCGG TCATCTCCAT GGACTTCTGC 1440 

ACCCCATTCC ATACCCTTCT TCACCTGGGG TACCCCTTCC AGTTTTCCCC TTGCTTCCCA 1500 

GGCCCTTGAC ATGGCTTACC TGCCTTCACT CCCAGCACCT TGCCCAACAG GATAAGCTGG 1560 

ATCCCCTTGG CCTTCTGAAT ATCCCAGTGT CTTCAGGTTT CCCAAGACCA CTTCCCTCTG 1620 

GGCTTCCAAA ATGGCCTTTA TCATTTCTCC AGTCTGTCAC CCTCCTTTCC TGCTCCCATA 1680 

CACCCAAGGC TTGTTTCTTC CCCTGTAAAA ACCACTGCCT CAATCTCTGG TTCACTCAAC 1740 

TAGTCACCAT GTCCTGAGGC ATGAAGCCTC CTCAGCTCTT GGAATTGCTC GCAAGGGGTC 1800 

ACTGCCTCTG AGTCATTGTG TTTTTCAAAG TGATTTCTTT TCTGTAGCTT TTTGACCTAA 1860 

GATCTCAGCA ATTTGAACAC TAACCTCTCC CCTCCTGGCT CAAGAATTAC TCCGAAGTCA 1920 

GTCTGCAGAA AATAAATATT TAGTATGACA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1980 

AAAAA 1985 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1416 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

ATATGAAGGG AAAGAATTTC ATTATGTTTT CTCAATTGAT GTCAATGAAG GTGGACCATC 60 

ATATAAATTG CCATATAATA CCAGTGATGA CCCTTGGTTA ACTGCATACA ACTTCTTACA 120 

GAAGAATCAT TTCAATCCTA TGrTTTCrGGA TCAAGTAGCT AAATTTATTA TTGATAACAC 180 

AAAAGGTCAA ATGTTGGGAC TTGGGAATCC CAGCTTTTCA GATCCATTTA CAGGTGGTGG 240 

TCGGTATGTT CCGGGCICTT CGGGATCTTC TAACACACTA CCCACAGCAG ATCCTTTTAC 300 

AGGTGCTCGT CCTTATCTTAC CAGGTTCTGC AAGTATGGGA ACTACCATGG CCGGAGTTCA 360 

TCCATTTACA GGGAATAGTG CCTACCGATC AGCTGCATCT AAAACAATGA ATATTTATTT 420 

CCCTAAAAAA GAGGCTGTCA CATTTGACCA AGCAAACCCT ACACAAATAT TAGGTAAACT 480 

GAAGGAACTT AATGGAACTG CACCTGAAGA GAAGAAGTTA ACTGAGGATG ACTTGATACT 540 

TCTTGAGAAG ATACTGTCTC TAATATGTAA TAGTTCTTCA GAAAAACCCA CAGTCCAGCA 600 

ACTTCAGATT TTCTGGAAAG CTATTAACTG TCCTGAAGAT ATTGTCTTTC CTGCACTTGA 660 

CATTCTTCGG TTOTCAATTA AACACCCCAG TGTGAATGAG AACTTCTGCA ATGAAAAGGA 720 

AGGGGCTCAG TTCAGCAGTC ATCTTATCAA TCTTCTGAAC CCTAAAGGAA AGCCAGCAAA 780 
CCAGCTGCTT GCTCTCAGGA CTTnTGCAA TTGTTTTGTT GGCCAGGCAG GACAAAAACT 840 

CATGATOTCC CAGAGGGAAT CACTGATGTC CCATGCAATA GAACTGAAAT CAGGGAGCAA 900 

TAAGAACATT CACATTGCTC TGGCTACATT GGCCCTGAAC TATTCTGnT GTTTTCATAA 960 

AGACCATAAC ATTGAAGGGA AAGCCCAATG TTTGTCACTA ATTAGCACAA TCTTGGAAGT 1020 

AGTACAAGAC CTAGAAGCCA CTTTTAGACT TCTTGTGGCT CTTGGAACAC TTATCAGTGA 1080 

TGATTCAAAT GCTCTACAAT TAGCCAAGTC TTTAGGTCTT GATTCTCAAA TAAAAAAGTA 1140 

TTCCTCAGTA TCAGAACCAG CTAAAGTAAG TGAATGCTGT AGA1TTATCC TAAATTTGCT 1200 

GTAGCAGTGG GGAAGAGGGA CGGATATTTT TAATTGATTA GTGTTTTTTT CCTCACATTT 1260 

GACATGACTG ATAACAGATA ATTAAAAAAA GAGAATACGG TGGATTAAGT AAAATTTTAC 1320 

ATCTTCTAAA GTCOTGGGGA GGGGAAACAG AAATAAAATT TTTGCACTGC TGAAAAAAAA 1380 

AAAAAAAAAA AAAAGGAAAC TCGAGGGGGG GCCCGG 1416 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1935 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

NTCTACCCTA ATCAAGATGG GGACATACTT CGCGACCAGG TTCTTCATGA ACATATCCAG 60 

AGATTGTCTA AAGTAGTGAC TGCAAATCAC AGAGCTCTTC AGATACCAGA GGTTTATCTT 120 

CGAGAAGCAC CATGGCCATC TGCACAATCA GAAATCAGGA CAATAAGTC5C TTATAAAACC 180 

CCCCGGGACA AAGTGCAGTG CATCCTGAGA ATGTGCTCTA CGATTATCAA CCTCCTCAGC 240 

CTGGCCAATG AGGACTCTGT CCCTGGAGCG GATGACTTTG TTCCTGTCTT GGTGTTrcTC 300 

TTGATAAAGG CAAATCCACC CTGTTTGCTG TCTACTGTGC AGTATATCAG TAGCTTTTAT 360 

GCTAGCTGTC TGTCTGGAGA GGAGTCCTAT TGGTGGATGC AGTTCACAGC AGCAGTAGAA 420 

TTCATTAAAA CCATCGATGA CCGAAAGTGA CCAAGACCAA GGCCCACCAA GGCAGCAGAC 480 

TGTTAATCAG ACAAACAGAT CTCTGAGAAG GTGCATCAGC TGCTTTGAAG GCTGAAGATT 540 

GTTTTGTATG ATACTGCACA GCATCAGGCA TTTTAAAGCA GATCTTTACT AAACAGGTTA 600 

25 ^'^^^^^ AAGCAGGTTC TCTCGTCTTT GGGCTCTTTC CTTTCTGAGT TGCATATTCT 660 

ATTTTCTTGT CCCCAAGTAG AGACTAGTAC TACAAAAAGG GACCACATTT 1TCAAGTATT 720 

TCTAAGTATA AAAAACAAAA CAAAAATCTC TTAGGAAATG TCTAGACCTC CATTCTTGGA 780 

TTCCCTTTCT TTCCTTTTAT TTTAAAAAAG AACAGTACCC CTCTTTTAAG ATGCTGTCrr 840 

ACATTAATGA GCATCTAATG GAAAGAAGGT ATGAGTTGCA CTGAGGATTA GAATAGOCGT 900 

GCGTTAGTGG CATTATCTAT AAATACACTC ACCTAAATTG AAAGCTAAGA AGGAAATOTA 960 

AATATAATAT ATATTTATAT TTGATGTAAT ATGGACATCT GCAGATTCTA ATAAACAAGG 1020 

ACTATTGCTG ATAGTAGGCT GTGACATACT GTCTTGTGAA ATGGTTTCCT TGACAAAATT 1080 

TAAGCTGAGC TTAAAAGCAA AAAAACAAAA AGTACACAGA AATATTTATT AAAATGTAAT 1140 

ACAGTTTATT GAACTTTCTA GGTATGGAGT TTGATGGACA GGGCTGCCTY TAATCAGTGT 1200 

GAAGGTCACT AAGTCACTTA GACATCTCAC CGTQGAAGTT TGTGAGCCTC CATTAGGAGA 1260 

TAGACTGATT ACCATACATG ACATAAAAAG GAACAGTGGA TAGCTCATAC TTTATGGTGG 1320 

TTCTTCTCCT CCGAAATAAT ATACTGCAGA AATCCCAGAC AGAGCTCCTT ACAAACCTTT 1380 

AATTGTAATA TATTTTTGAT GATTATTCAC ATTGAATGCA CAGACCAAGA ATTCAGTCAA 1440 

TGTCATTTTT TAAAAAACTA ATTTGTATTG TCTGCTCTAG TGATACAAGT TTTACTAGTG 1500 

ATAAACTATT TTAATCAACC ATACTATTCT TATGGAAAAA AATATCTATT TTGGCAGGTT 1560 

TCTGTGCCTT TATTTCCCTC TTCTGAAAAA AAGTCTGTGT TTTCATAGTT TGGITTGCAT 1620 

TGTATATCAA TAATTAATCA GGAATGGGTT TTGGTGCCTG AAAAATTGGC CATX3GAGGCA 1680 

CACCAAAGCT TCAAGCACAA GTCTTGTACA TGGGCCATCA CTGTCTGGTT TCACTTCGTG 1740 

TGTTTCCTAA ACACATTTAG CTGCTTTTTT AACAAACTCA GCCCCATACT TGAGTCCCTT 1800 

GTTGTTGGGA GCATTTCCAG GCATCTTTTA AGGGAACTGT GACAAACAGC CTCGGGCAGA 1860 

TGAACACGGA GGCTCTCTGT TGTCTGTCTC TGAGATCTTT GTGTCTGGGA ATGCCTAAAG 1920 

NTTTTGNTTT TTTTT 1935 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 599 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 
GAATTCGGCA CGAGCGTCCA CGCAGCCGCC GGCCGGCCAG CACCCAGGGC CCTGCATGCC 60 
AGGTCGTTGG AGGTCGCAGC GAGACATGCA CCCGGCCCGG AAGCTCCTCA GCCTCCTCTT 120 
CCTCATCCTG ATGGGCACTG AACTCACTCA AGACTCCGCT GCCCCCGACT CCCTGCTGAG 180 
AAGTTCAAAG GGCAGCACGA GGGGGTCTTT GGCTGCTATT GTCATCTGGA GGGGGAAGAG 240 

TQAGAGCCGG ATAGCCAAGA CCCCAGGCAT TTTCAGAGGT GGCGGGACCT TAGTCCTACC 300 
CCCAACACAC ACCCCTGAGT GGCTCATCCT CCCTTTGGGC ATAACGCTGC CCTTGGGGGC 360 
TCCAGAAACA GGCGGTGGGG ATTGTGCCGC TGAGACCTGG AAGGGCAGCC AGCGTGCCGG 420 
CCAGCTGTGT GCATTGCTGG CTTAATATGC AGGGCTTGGG GGGCTGTGGC CACATGCCCG 480 
GCAGGAGGTG AGTCAGGAGC CCTGTGGCGT GCTGGTGTGG GGATCGTGGG CATTTCAAAC 540 

GGGdTGTCG TACCCTGAAC AATGTATCAA TAGAGAAAAA AAAAAAAAAA AAAACTCGA 599 



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 784 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

GAATTCGGCA CAGAAAAAAA AGAGAGACTG GGTCTTACTG TGTTGCCCAG ACITGTCTTG 60 
AACTCCTGCC TCAGCCTCTC AAGTACTTGG GATTATAGGC CAAGAAGCCA CCATGCCTAG 120 
CTTCTTCCTG TCATTGATCC AGACTAATAC TCTGGGGTCA GCCTCATTTC TTCTCTTTCT 180 
CACTTTGCAC ATCCACTTGr CACCAAATCK RGTrCATTCT GCATCCTAAG TAAGTCCTIT 240 
GATTCCTCCA GTTGTTCATT AGTAATGTCT CAARTCTAAT rTTTTCTAGT ACnTTTCAGC 300 
CTCrrCTTTCC KGCCTTCAGT CTTAACTTCT CCAGTACATA KGCCACATTC TTGTCAGCAK 360 
GATCAWATTT TATTTAAAAA TACTTTACAW AKGTTTATKG CCAAATATTA GRAAATACAG 420 
ATTCATGGAA AGAAAAATCA CTGTCCCAAG GAGGTCACTG GCATCGTCAG GTTAAGGGGT 480 
GATTTTAATT TTTAAAAATG TATATTTTTT CCTOICTAGA GTAGTAACAC CCTIGAAAAC 540 
ACAVrnZCXTTT GTAAAGTCTC TAArrCTGTA CTCCGCATCT AGSTCRTCTC TTCTTTCTCA 600 
GATATTTTAC AATTTCATTT ATCACCACCT TTCICTAGCC ITTACCCGTC TCTTCAATAT 660 
TWACATATGC AGAAGTTTCT CCTAACAAAC ACCTCCCTCT GCCTCACHTC TOTTACCACC 720 
CTGTTGCTTT CTTTCCCTTC ACAATCAAAT TTAAGAGTGT CAAAAAAAAA AAAAAAAAAC 780 
TCGA 784 

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1035 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

AGAGGCCTGG CTGCGTTGCC CTATCTCCGT CTCCGCCACC CACTTAGCCyr TITAGGCATC 60 

AATTACCAGC AGTTrCTCCG CCACTATCTG GAAAATTACC CGATTCCTCC CGGCAGAATA 120 

CAAGAGCTTG AAGAACGCCG CAGTTGCGTG GAAGCCTGCA GAGCAAGGGA AGCAGCGTTT 180 

GATGCCGAAT ATCAGCGAAA TCCTCACAGG GTGGACCTCG ATAITTTAAC CTTTACGATA 240 

GCTCTGACTG CCTCIGAAGT TATCAACCCT CTGATAGAAG AACTTCGTTC CGATAAGTIT 300 

ATCAATAGAG AATAGTTAGG TGGTGACACT ACTTCAAGAG AACCTCTCCA TICCAGTCAT 360 

ACCAATCCTG CAACTTCATT TTCAGAAGTC AAGAGTATAT CGCGATAAGA CAGTGCACAG 420 

GTCGAGGGGA AAAAAAGGGG GAGGGGGAAG CTTATCITCA AAAAGCATCA CAGAAGTAGA 480 

AAAAAATGTC GAAAGCATTA TAACTGTAAC GTTCITIGAG TTTOTQATTG ATCCACATIT 540 
TTCCCCCTGC ATTATGGAAA ATGTCTCTCA GCATTGCITr ATTACAAACT AAAGGATCOT 600 

TTTATAAAAT TGAGACTGAT GAAACATCAA TACTAGAGCC CATCAGGATG AAAGAAATTA 660 

TCAAATAGTG CTGAACAGAA TAAGATGTTA ACGCTGAGTT ATTAGGACTC GAAGGCTATG 720 

AAAAGAACTT GAAATTGTCG GAATATGTGC TCTCTTCATG TCATATTCAA TAGAAGTTTC 780 

TAGOTTAAGA TTGATTTTGT GTTTTCTTAG GCATTTCAAG TGACAAGCAA AGTAAATGTA 840 

TATATTATGT GATAAATCAT GTTTTCAAGA ACGTCAAATT TCTGGACriT TTOCTTTCAA 900 

TTTTTAATTT TTAAAGTTTT TTTGGTATTA AAAAATCYAT TCACAAGCCA AAAAATWTWT 960 

WAAATWTWCM GCGAAAAGCC AAAAAAAAAA AAAAMMAGGG GGGGCCGGGC CCCATCCCCC 1020 

CAAGGGGGTC CNGNT 1035 

(2) INFORMATION FOR SEQ ID NO: 103: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2218 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

AGGTATTAGG CCCTTTTGTG GGAGCCCCAT GTTTTGTTTT TCTGAGTTGG TGGGGAGGGA 60 

SGGAGGGGGA GGGCTGAATT GTTTTGCAGA GGAAGATGGC ATCTGTGCTT TAAATTTCTC 120 

ATTACTGGGT TAGAAAACAA AGAGGGAKTG CCCTGCACAT TTTCTTTTGT GCTTTTAAAT 180 

GTTTCTTAAG TTGGAACAGG TTTCCTCGGG CCTGTTTTGA CTGATTGCTG GAGTGCATTT 240 

GATAGTTAAA AATTACTAAT TGGTTTTATT TCCCTTCACA CTCTGCCTCC CCACTTCTCC 300 

CCCCGTTACT GAAAAATAAC CATTTTAGTG TCAGGCTAGA AATTGAATTG CTGAGTTTTG 360 

TGTATCCTTT AAATTAAAAA CCACAAGTGT TTATTGTAGT GGTTAAACTG TAGCATCTCA 420 

GCATCTGGGT GGAAGCTGCC TATATTTCTT CCCAGTTTAA CTGGGGACCA TCTGTGAAAT 480 

TAATTTTCCA TCCAGACAGC TGCTGTGAGC AAATGAACAT AAATGCTCGC TGGAAATTTA 540 

CTAACCAGTT TTTATATTGA CCTGCAGTGT AAAAAGCACA TTTAATTATA AACAATATAT 600 

TCAAAATGGG CAAATTTTAT TTTCAAATGC AGTGrrAGAGC TAGATTAAAA GCAACTCTTT 660 

GCCACCTACT CTGCCCTTTT GGCAAAGTTA CCTTGAACAA AGAATCTTAA GGGTTTATTA 720 

AGAACTCTTT ATTTTCTTCA TACCCTGTTC TCTGCAGTGC TTTCTAACAG CTTCTGGGTG 780 

CAGATTITCT TCGGCATCCT TTTGCACTCA GCTTATTACA GGTAGGTAGT GCTTAAGAAA 840 

AGTCATGGAG GACTAAAGCC TAAGTCCTTT TCACTTTTCC TCCATCTGAA GGTAGGTGAG 900 

TTCATCCTCT TCATAGTAAT GCTGTTTTAC CAAGACTTTA TAGCAGATGG ACCCAGAAAG 960 

AATTTTCTCC TATTGTGTTC ACTACAACAG GATAGGGACA TCAGACAGCC CCAGAAACCC 1020 

CTTCCAGATC TGATATGGGA CTATTAATTT TTATGCTGTT AATTGGTATT CATTCACAAT 1080 
GCAGTTGAAG GGGGAAGGCT CCACTGCATT CriTCGCTAA GGCCTCAATC CTTGCTCATC 1140 
TOTAAGATCT ATACTCGAGG rrTTGTTTTC CTTTTAAAAT TCrTTAGGGA GAGAGGGATC 1200 
GTTTCTGAGG GGTTCTCAAA GTATGATTCA ATGTGCAACA TACAGGTAGG TCTTCAGCAT 1260 
AAGCTGAAAT ATATGCATGT AAAAACTTTG ACATCTTTTT rTTTAATITr CCACTTTCTT 1320 
CrrAACTTTA CTTCTCTTTT TCTCCCCCCC CCATCTTACA GAAGTTCAGG CCAAGGGAGA 1380 
ATGGTAGGCA CAGAAGAAAC ATGGCAAACT GCTCTGTCCT TTCAAACCAA AGTCTTCCCC 1440 
CCAACCCCAA ATTTGTCTAA GCACTGGCCA GTCTOTTCTG GGCATTCTIT TCTACAACCA 1500 
AATTCTGGGT TTirrTCTTC TrrCTTTAAA CATAGAGGTA CCACCACAAG GGATCCCCTA 1560 
CTCTCTCGCA GOTCTTGAAA GCATCTGOTT GAGGGAAAGG TCTCTCGGCA AGCAAG1X3GT 1620 
TATTTGGATT GCITCCTTCC CTTTTrCCAC CTTCGGACArr GYAATCATAA AATAACACTA 1680 
AATTCCAAAC CTCAAAAACT ATTATGGCCT GAGCACAGCT GAAATCTAGC AGAGTTrAAC 1740 
TCrrCTGCCT CCATGTCTGT CACTTATAAT TCAGGTTCTG CTCTTCGCTT CAGAACATCA 1800 
GCAGAAGAAT CGTTTTATGC TAGTTATTGC ATTCATGGTT GAAACTCAAC TTAGGGAAAG 1860 
GGTTCCAATG TATTAAGCAA TGGGCTGCTT CTCCCCAATC CTCCCTAACA ATTCGTTCTC 1920 
TGGACTTCTC ATCTAAAAGG rrAGTGGCTT TTGCTTXSGGA TCAGTCCTCT CTA1TCATCT 1980 
TCTTGCTGGT CTCCAGACAC ATTCCTGTTG CATTAAGACT IXSAAAGACIT GTAGATGTGT 2040 
GATGTTCAGG CACAGGATGC TGAAAGCTAT GTTACTAITC TTAGTITCTA AATTCTCCTT 2100 
TTGATACCAT CATCTTGTTT TCTTTTTGTA GGTATAAATA AAAACACTGT TGACAATAAA 2160 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 2218 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1351 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 
CTTCACAGAC TGACAGAATG GTriTGTrTT GTTTICTTTT GTmCTTTr GnTTTCAGA 60 
TGGACTCTAG CTCTGTCACC CAGGCTGGAG TGCAGTGGTG CGATCTCGGC TCACTGCAAG 120 
CTCCGCCTCC CGGGTTCTCA CCATTCTCCT GCCTCAGCCT CCCGAGTAGC TCGGACTACA 180 
GGCGCCCACC ACCACGCCCG GCTAATTTTT TGTATTTTTT AGTAGAGACG GGGTTTCACC 240 
ATGTTAGCCA GGATGGTCTC GATCTCCTGA CCTCGTGATC CGCCCGCYTC GGCCTCCCAA 300 
AGTGCTGGGA TTACAGGCGT GAGCCACCGT GCCTGCCCCA GAATGGITTT TAAAGCCACA 360 
GTTGAGARGC CACCCATTGC CCGGCGCCTG GACAGTGATC ATCTTGTTCA TCTTOTTCAG 420 
TCCTTTCTTG TGTGATTOGA ATTATTCATC CCCTTTGAAA GATGAGAAGG TTGAGATCCA 480 
AAGAGTCTAC CTTTCCAAGT TCTCACTGCT GGAAAGARCT AGAAGCACAG TTCAAAGTTC 540 
TGGNTTCTGG ACTCTGCAGT CCAGGTYTCC CTTVTCCCAC TTGCCTACCC TCAATGCCAC 600 
ACTGTTTTTG AAGTGGCCCA TAACTTGAAG GRAAAGTTTA AAGACAGTTC AAriTAATCA 660 
TCAGRATGCA TTCTTTTTTT TTTCGGARAC GGAKTTTCAC TCTTGCTCCC CASGCTGGAG 720 
TGCAATGGTG CAATGATCTC GGCTCACTGC AACCTATGCC TCCTGGGTTC AAGNGATTAT 780 
CCAGCCTCAG CCTCCCGAGT AGCTGGGATT ATQGGCGCCC ACCACCATCC CCAGCTAArr 840 
TTTGTATTTT TTTTTTTAGT AGAGATGGGG TTTCGCCAGG TTGGCCAGGC TCKTCTTCTC 900 
AAYTCCTGGC YTCAGGTGAT YTGCCCACYT CATCYTCCAA AAGTCCTCGG ATTf^CPJ3GCA 960 
TGAGCCACTG CGCCTGGCYT CAGAATGCAT TCTTACACAT CTATCCTAGA CATTTATAAG 1020 
CACTCTAATG GATAACAATC CAAGAATAAA TGATTGTAAA AGATGATGCC GAAGAGTTGA 1080 
TGTCAATCTT TTTTTCCTAA GAAAAAAAGT CCGCGAGTAT TAAATATTTA GATCAATCTT 1140 
TATAAAATGA TTACTTTGTA TATCTCATTA TTCCTATTTT GGAATAAAAA CTGACCTTCT 1200 
TTAATCATAT ACTTGTCTTT TGTAAATAGC AGCTTTTGTG TCATTCTCCC CACTTTATTA 1260 
GTTAATTTAA ATTGGAAAAA ACCCTCAAAC TAATATTCTT GTCTGTTCCA GTCTTATAAA 1320 
TAAAACTTAT AATGCATGTA AAAAAAAAAA A 1351 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2066 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

GGCACGAGGC GGCGGAGGGC CACAATCACA GCTCCGGGCA TTGGGGGAAC CCGAGCCGGC 60 

TGCGCCGGGG GAATCCGTGC GGGCGCCTTC CGTCCCGGTC CCATCCTCGC CGCGCTCCAG 120 

CACCTCTGAA GTTTTGCAGC GCCCAGAAAG GAGGCGAGGA AGGAGQGAGT GTGTGAGAGG 180 

AGGGAGCAAA AAGCTCACCC TAAAACATTT ATTTCAAGGA GAAAAGAAAA AGGGGGGGCG 240 

CAAAAATGGC TGGGGCAATT ATAGAAAACA TGAGCACCAA GAAGCTGTX3C ATTGTTGGTC 300

GGATTCTGCT CGTGTTCCAA ATCATCGCCT TTCTGGTC5GG AGGCTTGATT GCTCCAGGGC 360

CCACAACGGC AGTGTCCTAC ATGTCGGTGA AATGTGTGGA TGCCCGTAAG AACCATCACA 420

AGACAAAATG GTTCGTGCCT TGGGGACCCA ATCATTGTGA CAAGATCCGA GACATTCAAG 480 

AGGCAATTCC AAGGGAAATT GAAGCCAATG ACATCGTGTT TTCTGTTCAC ATTCCCCTCC 540 

CCCACATGGA GATGAGTCCT TGGTTCCAAT TCATGCTGTT TATCCTGCAG CTGGACATTC 600

CCTTCAAGCT AAACAACCAA ATCAGAGAAA ATGCAGAAGT CTCCATGGAC GTTTCCCTGG 660 

CTTACCGTGA TGACGCATTT GCTGAGTGGA CTGAAATGGC CCATGAAAGA GTACCACGGA 720 

AACTCAAATG CACCTTCACA TCTCCCAAGA CTCCAGAGCA TGAGGGCCGT TACTATGAAT 780 

GTGATGTCCT TCCTTTCATG GAAATTGGGT CTGTGGCCCA TAAGTTTTAC CTTTTAAACA 840 

TCCGGCTGCC TGTGAATGAG AAGAAGAAAA TCAATGTGGG AATTGGGGAG ATAAAGGATA 900 

TCCGGTTGGT GGGGATCCAC CAAAATGGAG GCTTCACCAA GGTGTGGTTT GCCATGAAGA 960 

CCTTCCTTAC GCCCAGCATC TTCATCATTA TGGTGTGGTA TTGGAGGAGG ATCACCATGA 1020 

TGTCCCGACC CCCAGTGCTT CTGGAAAAAG TCATCTTTGC CCTTGGGATT TCCATGACCT 1080 

TTATCAATAT CCCAGTGGAA TGGTTTTCCA TCGGGTTTGA CTGGACCTGG ATGCTGCTGT 1140 

TTGGTGACAT CCGACAGGGC ATCTTCTATG CGATGCTTCT GTCCTTCTGG ATCATCTTCT 1200

GTGGCGAGCA CATGATGGAT CAGCACGAGC GGAACCACAT TGCAGGGTAT TCGAAGCAAG 1260 

TCGGACCCAT TGCCGTTGGC TCCTTCTGCC TCTTCATATT TGACATGTGT GAGAGAGGGG 1320 

TACAACTCAC GAATCCCTTC TACAGTATCT GGACTACAGA CATTGGAACA GAGCTGGCCA 1380 

TGGCCTTCAT CATCGTGGCT GGAATCTGCC TCTGCCTCTA CTTCCTGTTT CTATGCTTCA 1440 

TGGTATTTCA GGTGTTTCGG AACATCAGTG GGAAGCAGTC CAGCCTGCCA GCTATGAGCA 1500

AAGTCCGGCG GCTACACTAT GAGGGGCTAA TTTTTAGGTT CAAGTTCCTC ATGCTTATCA 1560 

CCTTGGCXrrG CGCTGCCATG ACTGTCATCT TCTTCATCGT TAGTCAGGTA ACGGAAGGCC 1620 

ATTGGAAATG GGGCGGCGTC ACAGTCCAAG TGAACAGTGC CTTTTTCACA GGCATCTATG 1680 

GGATGTQGAA TCTGTATGTC TTTGCTCTGA TGrTCTTGTA TGCACCATCC CATAAAAACT 1740 

ATGGAGAAGA CCAGTCCAAT GGAATGCAAC TCCCATGTAA ATCGAGGGAA GATTGTGCTT 1800

TGTTTGTTTC GGAACTTTAT CAAGAATTGT TCAGCGCTTC GAAATATTCC TTCATCAATG 1860 

ACAACGCAGC TTCTGGTATT TGAGTCAACA AGGCAACACA TGTTTATCAG CTTTGCATTT 1920 

GCAGTTGTCA CAGTCACATT GATTGTACTT GTATACGCAC ACAAATACAC TCATTTAGCC 1980 

TTTATCTCAA AATGTTAAAT ATAAGGAAAA AAGCGTCAAC AATAAATATT CTTGAGTATA 2040 

AAAAAAAAAA AAAAAAAAAA AAAAAA 2066

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1705 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
AATTCGGCAK AGGGCAGCTG TCGGCTGGAA GGAACTGGTC TGCTCACACT TGCTGGCTTG 
CGCATCAGGA CTCGCTTTAT CTCCTGACTC ACGGTGCAAA GGTGCACTCT GCGAACGTTA 
AGTCCGICCC CAGCGCTTGG AATCCTACGG CCCCCACAGC CGGATCCCCT CAGCCTTCCA 
GGTCCTCAAC TCCCGYGGAC GCTGAACAAT GGCCTCCATG GGGCTACAGG TAATGGGCAT 
CGCGCTGGCC GTCCTCGGCT GGCTGGCCGT CATGCTGTGC TGCGCGCTGC CCATGTGGCG 
CGTGACGGCC TTCATCGGCA GCAACATTGT CACCTCGCAG ACCATCTGGG AGGGCCTATG 
GATGAACTGC GTGGTGCAGA GCACCGGCCA GATGCAGTGC AAGGTGTACG ACTCGCTGCT 
GGCACTGCCG CAGGACCTGC AGGCGGCCCG CGCCCTCGTC ATCATCAGCA TCATCGTGGC 
TGCTCTGGGC GTGCTGCTGT CCGTGGTGGG GGGCAAGTGT ACCAACTGCC TGGAGGATGA 
AAGCGCCAAG GCCAAGACCA TCATCGTGGC GGGCGTGGTG TTCCTGTTGG CCGGCCTTAT 
GGTGATAGTG CCGGTGTCCT GGACGGCCCA CAACATCATC CAAGACTTCT ACAATCCGCT 
GGTGGCCTCC GGGCAGAAGC GGGi^TGGG TGCCTCGCTC TACGTCGGCT GGGCCGCCTC 
CGGNCTCCTG CTCCTTGGCG GGGGGCTGCT TTGCTGCAAC TGTCCACCCC GCACAGACAA 
GCCTTACTCC GCCAAGTATT CTGCTGCCCG CTCTGCTGCT GCCAGCAACT ACGTGTAAGG 
TGCCACGGCT CCACTOTCTT CCTCTCTGCT TTGTTCITCC CTGGACTGAG CTCAGCGCAG 
GCTGTGACCC CAGGAGGGCC CTGCCACGGG CCACTGGCTG CTGGGGACTG GGGACTGGGC 
AGAGACTGAG CCAGGCAGGA AGGCAGCAGC CTTCAGCCTC TCTGGCCCAC TCGGACAACT 
TCCCAAGGCC GCCTCCTGCT AGCAAGAACA GAGTCCACCC TCCTCTGGAT ATTGGGGAGG 
GACGGAAGTG ACAGGGTGTG GTGGTGGAGT GGGGAGCTGG CTTCTGCTGG CCAGGATGGC 
TTAACCCTGA CTTTCGGATC TGCCTGCATC GGTGTTGGCC ACTGTCCCCA TTTACATTTT 
CCCCACTCTC TCrcCCTGCA TCTCCTCTGT TGCGGGTAGG CCTTGATATC ACCTCTGGGA 
CTCTGCCTTG CTCACCGAAA CCCQCGCCCA GGAGTATGGC TGAGGCCTTG CCCACCCACC 
TGCCTO3GAA GICCAGAGTG GATGGACGGG TTTAGAGGGG AGGGGCGAAG GTGCTGTAAA 
CAGGTTTGGG CAGTGGTGGG GGAGGGGGCC AGAGAGGCGG CTCAGGTTGC CCAGCTCTGT 1440
GGCCTCAGGA CTCTCTGCCT CACCCGCTTC AGCCCAGGGC CCCTGGAGAC 1X3ATCCCCTC 1500
TGAGTCCTCT GCCCCTTCCA AGGACACTAA TGAGCCTGGG AGGGTGGCAG GGAGGAGGGG 1560
ACAGCTTCAC CCTTGGAAGT CCTGGGGTTT TTCCTCTTCC TTCTTTGTGG TTTCTGTTTT 1620
GTAATTTAAG AAGAGCTATT CATCACTGTA ATTATTATTA TTTTCTACAA TAAATGGGAC 1680
CTGTGCACAG GRAAAAAAAA AAAAG 1705



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1167 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

TGCAGGAATT CGGCAGAGGT TTTCCGCTAG ACTCTGGCAG TTGGTGAGCA TCATGGCAAC 60 

CGTTACAGCC ACAACCAAAG TCCCGGAGAT CCGTGATGTA ACAAGGATTG AGCGAATCGG 120 

TGCCCACTCC CACATCCGGG GACTGGGGCT GGACGATGCC TTGGAGCCTC GGCAGGCTTC 180 

GCAAGGCATG GTGGGTCAGC TGGCGGCACG GCGGGCGGCT GGCGTGGTGC TGGAGATOAT 240 

CCGGGAAGGG AAGAITGCCG GTCGGGCAGT CCTTATTGCT GGCCAGCCGG GCACGGGGAA 300 

GACGGCCATC GCCATGGGCA TGGCGCAGGC CCTGGGCCCT GACACGCCAT TCACAGCCAT 360 

CGCCGGCAGT GAAATCTTCT CCCTCGAGAT GAGCAAGACC GAGGCGCTGA CGCAGGCCTT 420 

CCGGCGGTCC ATCGGCGTTC GCATCAAGGA GGAGACGGAG ATCATCGAAG GGGAGGTCGT 480 

GGAGATCCAG ATTGATCGAC CAGCAACAGG GACGGGCTCC AAGGTGGGCA AACTGACCCT 540 

CAAGACCACA GAGATGGAGA CCATCTACGA CCTGGGCACC AAGATGATTG AKTCCCTGAC 600 

CAAGGACAAG GTCCAGGCCG GGGACGTGAT CACCATCGAC AAGGCGACGG GCAAGATCTC 660 

CAAGCTGGGC CGCTCCTTCA CACGCGCCCG CGAACTACGA CGCTATGGGC TCCCAGACCA 720 

AGTTCGTGCA GTGCCCAGAT GGGGAGCTCC AGAAACGCAA GGAGGTGGTG CACACCGTGT 780 

CCCTGCACGA GATCGACGTC ATCAACTCTC GCACCCAGGG CTTCCTGGCG CTCTTCTCAG 840 

GTGACACAGG GGAGATCAAG TCAGAAGTCC GTGAGCAGAT CAATGCCAAG GTGGCTGAGT 900

GGCGCGAGGA GGGCAAGGCG GAGATCATCC CTGGAGTGCT GTTCATCGAC GAGGTCCACA 960 

TGCTGGACAT CGAGAGCTTC TCCTTCCTCA ACCGGGCCCT GGAGAGTGAC ATGGCGCCTG 1020 

TCCAGCAGGT CTATGGGGAT GCCGTQAGGG CTCTGGTAGC TGGTGCCCCG GATTCGCGTG 1080 






ATGCCACGGT TGGTCGCCTC GTGCCGAATT CCTGCAGCCC GGGGGATCC^ CTAGTTCTAG 1140 
AGCGGCCGCC ACCGCGGTGG ANCTCCN 1167 

(2) INFORMATION FOR SEQ ID NO: 108:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1907 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

GGCACAGGGG AATCATCGTG TGATGTGTGT GCTGCCTTTG TGAGTGTGTG GAGTCCTGCT 60 

CAGGTCTTAG GTACAGTGT3 TTTGATCGTG GTGGCTTGAG GGGAACCCTT GTTCAGAGCT 120 

GTGACTGCGG CTCCACTCAG AGAAGCTGCC CTTGGCTGCT CGTAGCGCCG GGCCTTCTCT 180 

CCTCGTCATC ATCCAGAGCA GCCAGTGTCC GGGAGGCAGA AGGTACCGGG GCAGCTACTG 240

GAGGACTGTG CGGGCCTGCC TGGGCTGCCC CCTCCGCCGT GGGGCCCTGT TGCTGCTGTC 300 

CATCTATTTC TACTACTCCC TCCCAAATGC GGTCGGCCCG CCCTTCACTT GGATGCITGC 360 

CCTCCTGGGC CTCTCGCAGG CACTGAACAT CCTCCTGGGC CTCAAGGGCC TGGCCCCAGC 420 

TGAGATCTCT GCAGTGTGTG AAAAAGGGAA TTTCAACGTG GCCCATGGGC TGGCATGGTC 480 

ATATTACATC GGATATCTGC GGCTGATCCT GCCAfiAGCTC CAGGCCCGGA TTCGAACTTA 540

CAATCAGCAT TACAACAACC TGCTACGGGG TGCAGTGAGC CAGCGGCTGT ATATTCTCCT 600 

CCCAITGGAC TGTGGGGTGC CTGATAACCT GAGTATGGCT GACCCCAACA TTCGCTTCCT 660 

GGATAAACTG CCCCAGCAGA CCGGTGACCG TGCTGGCATC AAGGATCGGG TTTACAGCAA 720 

CAGCATCTAT GAGCTTCTCG AGAACGGGCA GCGGGCGGGC ACCTGTGTCC TGGAGTACGC 780 

CACCCCCTTG CAGACTTTGT TTGCCATGTC ACAATACAGT CAAGCTGGCT TTAGCGGGGA 840

GGATAGGCTT GAGCAGGCCA AACTCTTCTG CCGGACACTT GAGGACATCC TGGCAGATGC 900 

CCCTCAGTCT CAGAACAACT GCCGCCTCAT TGCCTACCAG GAACCTGCAG ATGACAGCAG 960 

CTTCTCGCTC TCCCAGGAGG TTCTCCGGCA CCTGCGGCAG GAGGAAAAQG AAGAGGTTAC 1020 

TCTGGGCAGC TTGAAGACCT CAGCGGTGCC CAGTACCTCC ACGATGTCCC AAGAGCCTGA 1080 

GCTCCTCATC AGTGGAATGG AAAAGCCCCT CCCTCTCCGC ACGGATTTCT CTTGAGACCC 1140

AGGCTCACCA GGCCAGAGCC TCCAGTGGTC TCCAAGCCTC TGGACTGGGG GCTCTCTTCA 1200 

GTGGCTGAAT GTCCAGCAGA GCTATTTCCT TCCACAGGGG GCCTTGCAGG GAAGGGTCCA 1260 

GGACTTGACA TCTTAAGATG CGTCTTGTCC CCTTGGGCCA GTCATTTCCC CTCTCTGAGC 1320

CTCGGTGTCT TCAACCTGTG AAATGGGATC ATAATCACTG CCTTACCTCC CTCACGGTTC 1380 

TTGTGAGGAC TGAGTGTGTG GAAGTTTTTC ATAAACTTTG GATGCTAGTG TACTTAGGGG 1440 

GTGTGCCAGG TGTCTTTCAT GGGGCCTTCC AGACCCACTC CCCACCCTTC TCCCCTTCCT 1500 

TTGCCCGGGG ACGCCGAACT CTCTCAATGG TATCAACAGG CTCCTTCGCC CTCTGGCTCC 1560 

TGGTCATGTT CCATTATTGG GGAfiCCCCAG CAGAAGAATG GAGAGGAGGA GGAGGCTGAG 1620 

TTTGGGGTAT TGAATCCCCC GGCTCCCACC CTGCAGCATC AAGGTTGCTA TGGACTCTCC 1680 

TGCCGGGCAA CTCTTGCGTA ATCATGACTA TCTCTAGGAT TCTGGCACCA CTTCCTTCCC 1740 

TGGCCCCTTA AGCCTAGCTG TGTATCGGCA CCCCCACCCC ACTAGAGTAC TCCCTCTCAC 1800 

TTGCGGTTTC CTTATACTCC ACCCCTTTCT CAACGGTCCT TTTTTAAAGC ACATCTCAGA 1860 

TTAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAGGG CGGCCGC 1907 
(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 611 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ATGAATTAAC GCCAAGCTNT NAATAGGGAC TCACTATGGG GGAAAGNTGG GTAACGCCTG 60 

CAGGTACCGT TCCGGAATTC CCGOGTCGAC CCACGCGTCC GATGGGGCTT TAGTAAATCA 120 

GGCTTGCAGG CTCAAAGCTG CAATCTGCCC ACTCTCAGGT ACTGAGACTT TGTGGGCCTC 180 

AGACACCAGG AAGAAAGTTG GGATACAGTC ATTTGAGTTA AAAAGGGAAT GACCCCTCAG 240 

AAACCCGCAT TAGCAGTGTT ACTCTTGGAA GTGCCTTTAC TTTTAACGCT CTCTCrCTCTG 300 

AAAAAGAGGT GTTTGGTTAC GTGTGAGCCA ACATCACGTT TTGTTAGCTG TGATTTACCT 360 

TTGTCCGTTT AAAAGACTTC ACGGAGCCAT TCTGTATACA AGGTGTGCTC TTTCCAATGT 420 

AGAAGGGGTT ATGGAAAAGG GTGCGATCCT TTGCTGTAAA CTQGAGAGAC CAGTCCCAAA 480 

CAGAGGGGAA TTTTAAGCCC TTCTCATCAC CCAATTGGAT GTTTTTGCTT ATAGCAAATT 540 

CCTGCAAAAT AAATAAATAA ATATTTGCAA AACTAAAAAA AAAAAAAAAA AAAAAAAAAA 600 

GGGGGOrcCN C 611 
(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2632 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 
TCCCAGCTCT CAGGACAAGG GCCCTGGGCG ATCTTTTAAA AAAGCCGATT GGGTGTCTTT 60 

CTAAAANTAC AACCAGTACT TCATCGTCAA GTTTCTGGGA AGGGAGTCCC CTCCAGATTC 120 

TCATGGAGTG ACAAATCTTG ACTCTTGCTC CTGGAATTTT TCAGGCCCAA ACTAGCGTTT 180

CTACAATCAT TTATTTGGCA AATTTGTCTT GATTATGGGT GGCTGATGAG GAACGTGCTT 240 

TTOTTAGGAA CCGAAACTCG GCGGCGGTGA GGGCGTGTAC GCAATGAGTC CGGAAGAGGG 300 

TGAAATGCTT TCGGTAGGCA CTCCACGGCT GTGAAGATGG CGGCGGCTGC GTGGCTTCAG 360 

GTGTIGCCTG TCATTCTTCT GCTTCTGGGA GCTCACCCGT CACCACTGTC GTTTTTCAGT 420 

GCGGGACCGG CAACCGTAGC TGCTGCCGAC CGGTCCAAAT GGCACATTCC GATACCGTCG 480

GGGAAAAATT ATnTAGTTT TGGAAAGATC CTCTTCAGAA ATACCACTAT CTTCCTGAAG 540 

TTTCATGGAG AACCITGTGA CCTGTCTTTG AATATAACCT GGTATCTGAA AAGCGCTGAT 600 

TGTTACAATG AAATCTATAA CTTCAAGGCA GAAGAAGTAG AGTTGTATTT GGAAAAACTT 660 

AAGGAAAAAA GAGGCTTGTC TGGGAAATAT CAAACATCAT CAAAATTGTT CCAGAACTGC 720 

AGTGAACTCT TTAAAACACA GACCTTTTCT GGAGATTTTA TGCATCGACT GCCTCnTTA 780
GGAGAAAAAC AGGAGGCTAA GGAGAATGGA ACAAACCTTA CCTTTATTGG AGACAAAACC 840 
GCAATGCATG AACCATTCCA AACTTGGCAA GATGCACCAT ACATTTTTAT TGTACATATT 
GGCATTTCAT CCTCAAAGGA ATCATCAAAA GAAAATTCAC TGAGTAATCT TTTTACCATG 

ACTGTTGAAG TGAAGGGTCC CTATGAATAC CTCACACTTG AAGACTATCC CTTGATGATT 1020 

rmrCATGG TGATOTOTAT TGTATATGTC CTGTTTGGTG TTCTGTGGCT GGCATGGTCT 1080

GCCTGCTACT GGAGAGATCT CCTGAGAATT CAGTTTTGGA TTGGTGCTGT CATCTTCCTG 1140 

GGAATCCTTG AGAAAGCTCT CTTCTATGCG GAATTTCAGA ATATCCGATA CAAAGGARAA 1200 

TCTCTCCAGG GrrGCTTTCAT CCTTGCAGAR CTGCmCAG CAGTGAAACG CTCACTGGCT 1260 

CGAACCCTCG TCATCATAGT CAGTCTGQGA TATGGCATCG TCAAGCCACG CCTGGAGTCA 1320 

CTCTTCATAA GGTTGTAGTA GCAGRAGCCC TCTATCTTTT GTTCTCTGGC ATGGAAGGGG 1380

TCCTCAGAGT TACTGGGGCC CAGACTGATC TTGCTTCCTT GGCCTTTATC CCCTTGGCTT 1440 

TCCTAGACAC TGCCTTGTGC TGCrrCGATAT TTATTAGCCT GACTCAAACA ATGAAGCTAT 1500 

TAAAACTTCG GAGGAACATT GTAAAACTCT CTTTGTATCG GCATTTCACC AACACGCTTA 1560

TTTTGGCAGT GGCAGCATCC ATTGTGTTTA TCATCTGGAC AACCATGAAG TTCAGAATAG 1620 

TGACATGTCA GTCGGACTGG CGGGAGCTGT GGGTAGACGA TGCCATCTGG CGCTTGCTGT 1680 

TCTCCATGAT CCTCTTTGTC ATCATGGTTC TCTGGCGACC ATCTGCAAAC AACCAGAGGT 1740 
TTGCCTTTTC ACCATTGTCT GAGGAAGAGG AGGAGGATGA ACAAAAGGAG CCTATGCTGA 1800

AAGAAAGCTT TGAAGGAATG AAAATGAGAA GTACCAAACA AGAACCCAAT GGAAATAGTA 1860 

AAGTTAACAA AGCACAGGAA GATGATTTGA AGTGGGTAGA AGAGAATGTT CCTTCTTCTC 1920 

TGACAGATGT AGCACTTCCA GCCCTTCTGG ATTCAGATGA GGAACGAATG ATCACACACT 1980

TTGAAAGGTC CAAAATGGAG TAAGGAATGG GAAGATTTGC AGTTAAAGAT GGCTACCATC 2040 

AGGGAAGAGA TCAGCATCTG TGTCAGTCTT CTGTACGGCT CCATGGGATT AAAGGAAGCA 2100 

ATGACATCCT GATCTGTTCC TTGATCTTTG GGCATTGGAG TTGGCGAGAG GTGTCAGAAC 2160 

AAAGAGAACA TCTTACTGAA AACAACTTTCA TAAGATGAGA AAAATCTACG AGCTTCTTAT 2220 

TTACAACACT GCTGCCCCCT TTCCTCCCAG ACTCTGACAT GGATGTTCAT GCAACTTAAG 2280

TGTGTTGTTC CTGAACTTTC TGTAATGTTT CATTTTTTAA ATCTGACAAA CTAAAAAGTT 2340 

TAACGTCTTC TAAAAGATTG TCATCAACAC CATAATATGT AATCTCCAGG AGCAACTGCC 2400 

TGTAATnTT ATTTAlTrAG GGAGTTACAT AGGTGATGGG GGAAATTGTT AACTACCTTT 2460 

CATTTTCCTG GGAAGTCAAG GTTACATCTT GCAGAQGTTG TTTTGAGAAA AAAGGGCCCT 2520 

TCTGAGTTAA GGAGCCATAG TTCTATCAAT GATCAAAAGA AAAAAAAAAA AACTCGATCG 2580

GCACGAGGGG GQGCCCGGTA CCCAATTCGC CCTATGGGAN TCGAATGAGA CC 2632 

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2249 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

GAATTCGGCA CGAGCTCACC GTGCTGCGTG ACACAAGGCC AGCCTGCGCC TACGAGCCCA 60 

TGGACTTTKT RATGGCCCTC ATCTACGACA TGGTACTGSW TGTGGTCACC CTGQGGCTGG 120 

CCCTCTTCAC TCTGTGCGGC AAGTTCAAGA GGTGGAAGCT GAACGGGGCC TTCCTCCTCA 180 

TCACAGCCTT CCTCTCTGTG CTCATCTGGG TGGCCTGGAT GACCATGTAC CTCTTCGGCA 240 

ATGTCAAGCT GCAGCAGGGG GATGCCTGGA ACGACCCCAC CTTQGCCATC ACGCTGGCGG 300

CCAGCGCTGG GTCTTCGTCA TCTTCCACGC CATCCCTGAG ATCCACTGCA CCCTTCTGCC 360

AGCCCTGCAG GAGAACACGC CCAACTACTT CGACACGTCG CAGCCCAGGA TGCGGGAGAC 420 

GGCCTTCGAG GAGGACGTGC AGCTGCCGCG GGCCTATATG GAGAACAAGG CCTTCTCCAT 480 

GGATGAACAC AATCCAGCTC TCCGAACAGC AGGATTTCCC AACGGCAGCT TGGGAAAAAG 540 

ACCCAGTCGC AGCTTGGGGA AAAGACCCAG CGCTCCGTTT AGAAGCAACG TGTATCAGCC 600 

AACTCAGATO GCCGTCGTGC TCAACGGTGG GACCATCCCA ACTGCTCCGC CAAGTCACAC 660 

AGGAAGAMAC CTTTGGTGAA AGACTTTAAG TTCCAGAGAA TCAGAATTTC TCTTACCGAT 720 

1TCCCTCCCT GGCTGTGTCT TTCTTGAGGG AGAAATCGGT AACAGTTGCC GAACCAGGCC 780 

GCCTCACAGC CAQGAAAITT GGAAATCCTA GCCAAGGGGA TTTCGTGTAA ATGTGAACAC 840 

TGACGAACTG AAAAGCTAAC ACCGACTGCC CGCCCCTCCC CTGCCACACA CACAGACACG 900 

TAATACCAGA CCAACCTCAA TCCCCGCAAA CTAAAGCAAA GCTAATTGCA AATAGTATTA 960 

GGCTCACTGG AAAATGTCGC TGGGAAGACT GTTTCATCCT CTGGGGGTAG AACAGAACCA 1020 

AATTCACAGC TGCTCGCCCA GACTGGTGTT GGTTGGAGGT GGGGGGCTCC CACTCTTATC 1080 

ACCTCTCCCC AGCAAGTCCT GGACCCCAGG TAGCCTCTTG GAGATGACCG TTGCGTTGAG 1140 

GACAAATGGG GACTTTGCCA CCGGCTTTGC CTGGTGGTTT GCACATTTCA GGGGGGTCAG 1200 

GAGAGTTAAG GAGGTTCTGG GTGGGATTCC AAGGTGAGGC CCAACTGAAT CGTGGGGTTGA 1260 

GCTTTATAGC CAGTAGAGGT GGAGGGACCC TGGCATGTGC CAAAGAAGAG GCCCTCTGGG 1320 

TGATGAAGTG ACCATCACAT TTGGAAAGTG ATCAACCACT GTTCCTTCTA TGGGGCTCTT 1380 

GCTCTAGTCT CTATCGTGAG AACACAGGCC CCGCCCCTTC CCTTGTAGAG CCATAGAAAT 1440 

ATTCTCGCTT GGGGCAGCAG TCCCTTCTTC CCTTGATCAT CTCGCCCTGT TCCTACACTT 1500 

ACGGGICTAT CTCCAAATCC TCTCCCAATT TTATTCCCTT ATTCATTTCA AGAGCTCCAA 1560 

TGGGC?rCTCC AGCTCAAANS CCCTCCGGGA GGCAGGTTGG AAGGCAGGCA CCACGGCAGG 1620 

TITTCCGCGA TGATCTCACC TAGCAGGGCT TCAGGGGTTC CCACTAGGAT GCAGAGATGA 1680 

CCTCTCGCTG CCTCACAAGC AGTGACACCT CGGGTCCTTT CCGTTGCTAT GGTGAAAATT 1740 

CCTGGATGGA ATGGATCACA TGAGGGTTTC TTGTTGCTTT TGGAGGGTGT GGGGGATATT 1800 

TTCTTTIGGT TTITCTGCAG GTTCCATGAA AACAGCCCTT TTCCAAGCCC ATTGTTTCTG 1860 

TCATGGTITC CATCTGTCCT GAGCAAGTCA TTCCTTTGrT ATTTAGCATT TCGAACATCT 1920 

CGGCCATTCA AAGCCCCCAT GTTCTCTGCA CTGTTTGGCC AGCATAACCT CTAGCATCGA 1980 

•nCAAAGCAG ACTTTTAACC TGACGGCATG GAATGTATAA ATGAGGGTGG GTCCTTCTGC 2040 

AGATACTCTA ATCACTACAT TGCTTTTTCT ATAAAACTAC CCATAAGCCT TTAACCTTTA 2100 
AAGAAAAATG AAAAAGGTTA CrTTTTTC-GGG C-CCGGGGGAG GACTGACCGC TTCATAAGCC 2160
ACTTACGTCTG AGCIGAGTAT GC^CAATAA .-JCCTITTGAT ATTTCTCAAA AAAAAAAAAA 2220 
AAAAAMCCCG GGGGGGGC-CC C^3A::cT^ 2249 



(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2193 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
GATACTATAA C-GCAAGTG.=.C TCACGGGTGC GCCGTTAGAC TAGTGGATCC CGGGTGCAGG 
AATTCGGC^G AGCGCCC-CCG GAGCCGAAGT C-CTGGCGCCC CCGCGGCCGC TGCCTCCGCG 
GANCCCAAAA TCATGAAAZST CACCGTGAAG ACCCCGAAGA AAAGGAGGAA TTCGCCGTGC
CCGAGAATA,G CTCCGTCCAG 'IAJ:=TTTAAC-G AAGAAATCTC TAAACGTriT AAATCACATA 
CTGACCAACT TGrCTTTGArA TTTGCrGGAA AAATTTTGAA AGATCAAGAT ACCTTGAGTC 
AGCATGGAAT TC\7GATG2A CmACTGTTC ACCTTGTCAT TAAAACACAA AACAGGCCTC 



60 



TTCAACACAC CAGGAATSCA GAGCTTGTTG CAACAAATAA CTGAAAACCC ACAACTTATC 



60 
120 
180 
240 
300 
360 



AGGATCATTC AC-CTCAGC-.-. ACA;-ATACAG CTGGAAGCAA TGTTACTACA TCATCAACTC 420 

CTAATAGTAA C?CTACATCT GGTTCTCCTA CTAGCAACCC TTTTGGTTTA GGTGGCCTTC 480

GGGGACTTGC AGGTCTGA3T AGCrrGGGTr TGAATACTAC CAACTTCTCT GAACTACAGA 540 

GTCAGATGCA GCGACAACTT TTGICTAACC CTGAAATGAT GGTCCAGATC ATGGAAAAWC 600

CCYTTGTTCA GAGCATC-CTC ::?C-AATCCT GACCTGATGN AGACAGTTAA TTATCGCCAA 660 

TCCACAAATG G^jGCAGTTGA TAC-iSAGAAA TCCCAGAAAT TAGTCATATG TTGAATAATC 720 
CAGATATAAT GAGACAAACG TTGGAACTTG CCCAGGAATC CAGCAATGAT GCAGGAGATC 780



ATGAGGAACC AGGACCa--3C TTTaajGCAAC CTAGAAAGCA TCCCAGGGGG ATATAATGCT 840 

TTAAGGCGCA TGTACACAGA TATICAGGAA CCAATGCTGA GTGCTGCACA AGAGCAGTTT 900 

GGTGGTAATC CATTTGCTTC CTT3GTGA3C AATACATCCT CTGGTGAAGG TAGTCAACCT 960 

TCCCGTACAG AAAATAG-^^A TCCirTACCC AATCCATGGG CTCCACAGAC TTCCCAGAGT 1020 

TCATCAGCrr CCVSCGGCAC TGCCAGCACT GTGGGTQGCA CTACIGGTAG TACTGCCAGT 1080

GGCACTTCTG GGCAGAGIAC TACTGCGCCA AATTTGGTGC CTGGAGTAGG AGCTAGTATG 1140 
CAAAACATCT TGTCTGCCCC CTACATGAGA AGCATGATGC AGTCACTAAG CCAGAATCCT
GACCTTGCTG CACAGATGAT GCTGAATAAT CCCCTATTTG CTGGAAATCC TCAGCTTCAA 
GAACAAATGA GACAACAGCT CCCAACTTTC CTCCAACAAA TGCAGAATCC TGATACACTA 
TCAGCAATGT CAAACCCTAG AGCAATGCAG GCCTTGTTAC AGATTCAGCA GGGTTTACAG 
ACATTAGCAA CGGAAGCCCC GGGCCTCATC CCAGGGTTTA CTCCTGGCTT GGGGGCATTA 
GGAAGCACTG GAGGCTCTTC GGGAACTAAT GGATCTAACG CCACACCTAG TGAAAACACA 
AGTCCCACAG CAGGAACCAC TGAACCTGGA CATCAGCAGT TTATTCAGCA GATGCTGCAG 
GCTCTTCCTG GAGTAAATCC TCAGCTACAG AATCCAGAAG TCAGATTTCA GCAACAACTG 
GAACAACTCA GTGCAATGGG ATTTTTGAAC CGTGAAGCAA ACTTGCAAGC TCTAATAGCA 
ACAGGAGGTG ATATCAATGC AGCTATTGAA AGGTTACTGG GCTCCCAGCC ATCATAGCAG 
CATTTCTGTA TCTKGAAAAA ATGTAATTTA TTTTTGATAA CGGCTCTTAA ACTTTAAAAT 
ACCTGCTTTA TTTCATTTTG ACTCTTGGAA TTCTGTGCTG TTATAAACAA ACCCAATATG 
ATCCATTTTA AGGTGGAGTA CAC3TAAGATG TGTGGGTTTT TCTGTATTTT TCTTTTCTGG 
AACAGTGGGA ATTAAGGCTA CTGCATGCAT CACTTCTGCA TTTATTGTAA TTTTTTAAAA 
ACATCACCTT TTATAGTTGG GTGACCAGAT TTTGTCCTGC ATCTGTCCAG TTTATTTGCT 
TTTTAAACAT TAGCCTATGG TAGTAATTTA TGTAGAATAA AAGCATTAAA AAGAAGCAAA 
AAAAAAAAAA AAAAATTCCT GCGCCCGCGA ATTCTTCT 

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1043 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 
CTCAAGTGTA TGTGGTGAGG AAGAAGAGGC TCCTACTGTA GACAGCCTTG TTCTACAGAT 
CCTCCCAGAA ATCTCTGGGC CAGGTGGAAC CCAGGGTCAG AGAGGGATGG GAGAGAGGTT 
TAATTTTCCA TGATAAATAA AAATCTATAA AATAATAAAC AAGAGAAAAG AGATTGGAAA 
CAGCCAGGTT GGAGCAGTGA GTGAGTAAGG AAACCTGGCT GCCCTCTCCA GATTCCCCAG 
GCTCTCAGAG AAGATCAGCA GAAAGTCTGC AAGACCCTAA GAACCATCAG CCCTCAGCTG 
CACCTCCTCC CCTCCAAGGA TGACAAAGGC GCTACTCATC TATTTGGTCA GCAGCTTTCT 
TGCCCTAAAT CAGGCCAGCC TCATCAGTCG CTGTGACTTG GCCCAGGTGC TGCAGCTGGA 
RGACTTQGAT GGGTTTGAGG GTTACTCCCT GAGTGACTGG CTGTGCCTGG CTTTTCTCGA 480

AAGCAAGTTC AACATATCAA AGATWAATGA AAATGCAGAT GGAAGCTTTG ACTATGGSCT 540 

CTTCCAGATC AACAGCCACT ACTGGTGCAA CRATTATAAG AGTTACTCGG AAAACCTTTC 600 

CCACGTAGAC TGTCAAGATC TGCTGAATCC CAACCTTCTT GCAGGCATCC ACTGCGCAAA 660 

AAGGATTGTG TCCGGAGCAC GGGGGATGAA CAACTGGGTT AGAATGGAAG KTTGCACTGT 720

TCAGGCCGGC CACTCTTCTA CTGGCTGACA GGATGCCGCC TGAGATKAAA CARGGTGCGG 780 

GTGCACCGTG GARTCATTCC AAGACTCCTG TCCTCACTCA RGGATTCTTC ATTTCTTCTT 840 

CCTACTGCCT CCACTTCATG TTATTTTCTT CCCTTCCCAT TTACAACTAA AACTGACCAG 900 

AGCCCCAGGA ATAAATGGTT TTCTTGGCTT CCTCCTTACT CCCATCTGGA CCCAGTCXCC 960 

TGGTTCCTGT CTGTTATTTG TAAACTGAGG ACCACAATAA AGAAATCTTT ATATTTATCG 1020

AAAAAAAAAA AAAAAAAACT CGA 1043 

(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 703 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GAATTCGGCA CGAGTGCGCG GGCACCACGG CGGTTTTTCG ACGCTGGCGG TGGACGCAGG 60 

CAGCATGGAC CACGGTTGCT GGGCGGATGG GGAGCGTCTA TGGTCAGTTG CCTTAGAAGT 120 

GGTGAGATGG GAAGCTGCAG TTGGAAGACC CTGGAGGATG CCTGACAAGG GGATGTCTGA 180 

CACATGATTG GAGCTCTTTT TGAAATGTTT CTTGCCCTTC CTGGAGCAGA GGAGCCATTA 240 

TTTATGCAGG TACATCGAAG TCTTTTGACC TCCATACAGT GATTATGCTT GTCATCGCTG 300

GTGGTATCCT GGCQGCCTTG CTCCTGCTGA TAGTTGTCGT GCTCTGTCTT TACTTCAAAA 360 

TACACAACGC GCTAAAAGCT GCAAAGGAAC CTGAAGCTGT GGCTGTAAAA AATCACAACC 420 

CAGACAAGGT GTGGTGGGCC AAGAACAGCC AGGCCAAAAC CATTGCCACG GAGTCTTGTC 480 

CTGCCCTGCA GTGCTGTGAA GGATATAGAA TGTGTGCCAG TTTTGATTCC CTGCCACCTT 540 

GCTGTTGCGA CATAAATGAG GGCCTCTGAG TTAGGAAAGG TGGGCACAAA AATCTTCATG 600

AGCAATACTT CTTAGTAGAT TGTTTTGTTA TTCAAATCAA GTTCTAGTGT TTTTATGTGA 660 

GATTATATAA TTTACAGTGT TGTTTTATAT ACTTTTGAAT AAA 703 
(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3684 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

GGCAGAGGGG GCATGAGCAG GAGGAGGATT ACCGCTACGA GGTGCTCACG GCCGAGCAGA 60 

TTCTACAACA CATGGTGGNA ATGTATCCGG GAGGTCAACG AGGTCATCCA GAATCCAGCA 120 

ACTATCACAA GAATACTCCT TAGCCACTTC AATTGGGATA AAGAGAAGCT AATOGAAAGG 180 

TACTTTGATG GAAACCTCGA GAAGCTCTTT GCTGAGTGTC ATGTAATTAA TCCAAGTAAA 240

AAGTCTCGAA CACGCCAGAT GAATACAAGG TCATCAGCAC AGGATATGCC TTGTCAGATC 300 

TGCTACTTCA ACTACCCTAA CTCGTATTTC ACTGGCCTTG AATGTGGACA TAAGTTTTGT 360 

ATV^CAGTCCT GGAOT5AATA TXTAACTACC AAAATAATGG AAGAAGGCAT GGGTCAGACT 420 

ATTTCGTGTC CTGCTCATGG TTGTGATATC TTAGTGGATG ACAACACAGT TATGCGCCTG 480 

ATCACAGATT CAAAAGTTAA ATTAAAGTAT CAGCATTTAA TAACAAATAG CTTTGTAGAG 540

TGCAATCGAC TGTTAAAGTG GTGTCCTGCC CCAGATTGCC ACCATGTTGT TAAAGTCCAA 600 

TATCCTCATC CTAAACCTGT TCGCTGCAAA TGTGGGCGCC AATTTTGCTT TAACTGTGGA 660 

GAAAATTCGC ATGATCCIOT TAAATGTAAG TGGTTAAAGA AATGGATTAA AAAGTGTGAT 720 

GATGACAGTG AAACCTCCAA TTGGATTGCA GCCAACACAA AGGAATGTCC CAAATGCCAT 780 

GTCACAATTG AGAAGGATGG TCGTTGTAAT CACATGGTCT GTCGTAACCA GAATTGTAAA 840
GCAGAGTTTT GCrGGGTOTG TCTTGGCCCA TGGGAACCAC ATGGATCTGC CTGGTACAAC 
TCTAACCGCT ATAATGAGGA TGATGCAAAG GCAGCAAGAG ATGCACAGGA GCGATCTAGG 

GCAGCCCTGC AGAGGTACCT GTTCTACTGT AATCGCTATA TGAACCACAT GCAGAGCCTG 1020 

CGCmGAGC ACAAACTATA TGCTCAGGTG AAACAGAAAA TGGAGGAGAT GCAGCAGCAC 1080 

AACATGTCCT GGATTGAGGT GCAGTTCCTG AAGAAGGCAG TTGATGTCCT CTGCCAGTGT 1140

CGTGCCACAC TCATGTACAC TTATGTCTTC GCTTTCTACC TCAAAAAGAA TAACCAGTCC 1200 

ATTATCnrc AGAATAACCA AGCAGATCTA GAGAATGCCA CAGAGGTGCT CTCQGGCTAC 1260 

CTTGAACGAG ATATTTCCCA AGATTCTCTG CAGGATATAA AGCAGAAAGT ACAAGACAAG 1320 

TACAGATACT GTGAGAGTCG ACGAAGGGTT TTGTrACAGC ATGTGCATGA AGGCTATGAA 1380 

AAAGATCTGT GGGAGTACAT TGAGGACTGA GAATGGCCCT GCATAAAATG AACTCTGAAA 1440

ACTTTACCAT CTAGAGTGCT CATGCAATTA AAACAAAACA AACACAAACA AGGAGGCACT 1500

AAGCCTATTC TGACACXACT GGTCTGTAGT ACCAGAATTG TTTTCTTAAT GGAAAGTTTA 1560 

AGTAAATTAT ATTGTAATAA AAAGGTAGAT AAACCATTGT ACAACAGTAT TCTAGGCCGC 1620 

CAACAAAAGT GTGACAGACA CACTAAAAGC CCTCCAACTT TAACTTGTAA CX5TAGCTTCA 1680 

TTCTCAAAGC TGACTCCTTT TTTTTCTTTT TCCTTTTCCT GAGTCTAGTA CAGTTAAAAT 1740 

TTCAAACAGC TCCTTGACAC TGCTTTTCAT GTTCAAACCA GCCATTTTGT TCTACTITCG 1800 

TAAAGGACCT CTTCCCCTTC CTCCCCTACA CATACAGATA CACCCACACA CAGACTCACT 1860 

CTCTTTCTCT CATACCCCAA GGTCATGAGT GAATGATGCT TAGTrCCTTC TAAAGAAAAT 1920 

CTTGGGATGG GGAAAGGGGT AGGCAGCAAG AGGATTCAAC AAACGAAAAA CATAAAAACT 1980 

TTGTATATGA CTTTTAAAAC AAGAGGACAA CACAGTATTT TTCAAAATTG TATATAGCGC 2040 

ATATGCATGG ACAAAGCAAG CGTGGCACGT GTTTGCATAA TGTTTAATTA CAAAAAAATA 2100 

TTTATTCTTT AAAAATCTTC AAGATTATGT CTATTTGCTG TGCATTTTCT TTGAGriTGC 2160 

TTATCTTTCC CGGGTTGGGG TTGGGATAAA GGTGTGTCGG TTTAGCACCT CTGGAAGACC 2220 

TATCTAGAGC TCTTTCACTT TCCTGAGGTT ATTTTGCCCY TTCTGGTCTT GGTATGTCTG 2280 

TTGCCGQCCA TGGGCTNCAY GCCTTGAATT CCTGCTCTTC ATCAGGGACA AGGGAGGTCA 2340 

AGCTCTGACT AATGCCATGA CCTGATTAAG GGGTACAGCA GGGAGTTTTC TTGCTACAGC 2400 

TCATGAATTA ACCTGTCCCA ACCTAATCCC CCTCCATGGC ATCATGCCTC TACCCAAGCC 2460 

TTTGTGTGCC CATGTTATGC ACACAGCTGT AGGCATTCTT AAGTCCCCTG TCGCATCCAG 2520 

TGGAAGCATT TTAAAATTTC TrTTACTTTT TGGTTTTCCC TTAATTGCTG CTTTTCAGAT 2580 

TTTAGTTATG GCTCGTCTGC TCACCCCTTC TCTACATTAG GGTGTCAAAG AGAATCTTTT 2640 

GCTTTAAATA TAAATAGCCA TTCATTTAGT CTCAGATTGT GAATTTAAAA TCGTGGATAC 2700 

CGAAATTGCT TGTGTGTGTT GCTGTGGGTT TGGTTTGAAG GCAAACACCC CTAGAACATC 2760 

ATATTCCCAT CTAGTGCATT TAAATAGAAA TCACTGAGTT TGCTGCTTTT TTATTGTCAG 2820 

CAGATAGGAG AATTAATAAT GCATTTTAGC TGTGATGTCC ATTTTTATGA AATTCCTACT 2880 

AAGAGCTATG TTAAAAGTAA AGGATGGTGG TGGTTGTATT AACTATATAC CTCTTTAGGC 2940 

CATTCTGGCT GTOGTATTTT TCAATAGGTC AGCATCTGTA AATCTGTCAG TTTTATACAG 3000 

GAGTGCAGAG TGAACTAGGC AACTAGATTA AGAGGTCTAA ATATGAAATA CCAGTTGAGG 3060 

CTGAGGACCT CTTCGTCTTC CTTTAAATGT CTTTTCCCTA GGGAGTGTTT ACCATTTOIG 3120 

AGGCAGCTTT GTCTGCTCTT ACACTGTACA TCCTATTACT CCATTGGGAA GTAGGrTCAC 3180 

TTTCCTCTGG CCTTTrGCCT AAGTTAGGCT TTGCTGAATC AACCCTACTT TTCCTTTTAG 3240 
AAAAGGTTGT TACAGGAGAT TTACTGGCAA CTGTTCTTTT CCCATCAAAA ATCAGTGAAT
GTTTGCiGAG TATAAATGCT GCTTCCTTAA ACCACTTGTC GCTTTAGGAT CAACTTTACC 
TGTACCTTTT CTCCTTTCCT CCCTTGCCAC CTCAGGTGCA AATCTGAACT CAGTGTCTGC 
TTCTTCCATT TTCTCGTCTC TCTCCCCTCT TCCCCCATTA TCCATATGAC ATTATTTTAC 
TTCAAATGAC AGCATCAATC TTAAAAAGAT ATACATTAAA ACTAAGGAGT TTTTTTAAAG 
AAAGCCTCAA TAAGTTCCTT TCCCTGGTAA CTTTGAAAAG CAGTCAGAGT TGCTATATAG 
ATATATGTGG CTCCTTTAAA ATGCTTTGTG TATGTGTGGT GTTTAAAAAA AAAAAAAAAA 
TTCGGGGGGG GGCCCGGTNC CCAT 

(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1965 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 
AAGAAAGGGT ATTAAAATTC TAGATCACAT ATGGACCCGG GAAGGTTTTT NACCCTCTGT 
TAGTGACATC GAGTCTCCCA CTAGACAAAA TAGGTGGAAA AATCTCTCGA GGGCTCACAT 
TGTTTTGTCA TCTTCAGGAA AAACACCACC AGGCCATACC ACAGCCTGCC CAGTGAGGCG 
GTCTTTCCCA ACAGCACCGG GATGCTGGTG GTGGCCTTTG GGCTGCTGGT GCTCTACATC 
CTTCTCGCTT CATCITCGAA GCGCCCAGAG CCGQGGATCC TGACCGACAG ACAGCCCCTG 
CTGCATGATG GGGAGTGAAG CAGCAGGAAG GGGCTCCCAA GAGCTCCTGG TGGTGCAGCC 
TGTGCTCCCC TCAGAAGCTC TGCTCTTCCC AGGGCTCCCG GCTGGTTTCA GCAGGCGACT 
TTCTTCCAAT GCTGGGCCCA GACTTCTTGC CTGGGTGCTG GCCTGCCCTC TCCGGNCCGC 
TIGCTCCCTG TCTGCTTTCC TTGGTGGYTT TGCTGGGTrGC TGGGCCTGCC CTCTCCGGCC 
GCTTGCTCCC TOTCTGCTTT CCTTGGTGGC TTTGCTGGGT GCTGGGCCTG CCTTCTCTGG 
CTCCTTGCTG CCTGTCTGCT TTCCTTGGTG GCTTTGGCTT CTGCACTCCT TGGCGTCASC 
TCTCAGGTCC TCCATTCACA CGAGGTCCTC CTCGCTCTGG CCGCTCITGC TGCTCCTGTC 
TGAAGAWATC AGACTCATTT CCTCTTAAGA CTCCTAGGGA TGTGGTGAAG AGCTGGGACT 
CAAGTGCAGT CCACGGTOTG AAACATGAGG GARGTGAGGT GTCCGTCCAC TTCCCCCATA 
AAGGTOTCCA TTTCAGTTAG GCTGCCCCGC CACAGAGCAG QCITCATCTG CTCTGCCATC 
CAGCCCCATC TGGATGTGAG GTGGGGTGGA GACATCATGG GGTGATTGCA GAAAGGGGGA 960

GTGGCGGCCC ACGCAGCTTC TGCTGAGGAG CTGACCGCTC TGAGCTGTTC TGITTCGTAT 1020

TGCTGCTCTG TGTCTGCATG TATTGTGACC GTGCGGCTCC ACCTCTTCCA GCTGCTGCTA 1080

CAGCTGAGGC CTGGATCCCG GCCTTTCCCT GTGACTTACG TGTCTGTCAC CGGCANGCAG 1140

CCCTACAAAT CCTGGTGACC TGCTCTCCCA AGAACAGAGC CTGTCCCCAG ATGTCCCAGT 1200

AGCGATGAGT AACAGAGGTG GCTGTGGACT TCCTCTACTT CTCCTTGCTC GATCAGGGCC 1260

TTCCTGCCTC CCGCTGGGCA GGTCTGGCCT TGCTCTCTTG GCAGGGCCCC AGCCCCTCTC 1320

ACCACTCTGC AGCTCACCAT GCAGCTGATG CCAAAGTTGT GGTGTCCAGT GTGCAGCAGC 1380

CCTGGGAGCC ACTGCCACCT TCAGAGGGGT TCCTTGCTGA GACCCACATT GCTTCACCTG 1440

GCCCCACCAT GGCTGCTTGC CTGGCCCAAC CTAGCGTTCT GTGCCA1GCT AGAGCTTCAG 1500

CTGTTGCTCT TCTTCAGGGG AGGAAATAGG GTGGAGAGCG GGAAGGGTCT TGCTCCTAAG 1560

TGTOGCTGCT GTGGCTTTTT TGCCTTCTCC AAAGACGCAC TCCCAGGTCC CAAGCTOCAG 1620

ACTGCTGTGC TTAGTAAGCA AGTGAGAAGC CTGGGGTTTG GAGCCCACCT ACTCTCTCGC 1680

AGCATCAGCA TCCTACTCCT GGCAACATCA GGCCAACGTC CACCCCAGCC TCACATTGCC 1740

AGATGTTGGC AGAAGGGCTA ATATTGACCG TCTTGACTGG CTGGAGCCTT CAAAGCCACT 1800

GGGATGTCCT CCAGGCACCT GGGTCCCATG ACCAGCTCCC CGTCTCCATA GGGGTAGGCA 1860

TTTCACTGGT TTATGAAGCT CGAGTTTCAT TAAATATGTT AAGAATCAAA GCTGTCTTTG 1920

TTCAGGCTGC TATAACAAAA ATATAATAGC CTGGGTGGCT TAAAC 1965

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

AGTGATCCCC TTGCCTCGGC CTCCCAAAAT GCTGGAATTG TAAGCGTGGG CCTCTGCACC 60 

CGGCCTGGTC CGCAATTTAA AAACGCACAG CCACCATTCC CTYTCCAGAA AGCACCCAGA 120 

TGCCTTTGGG AGAACCAGCC TCCTCCATGG AGGAAAGCTT GGGATCTGCC TTCCCACCTG 180 

GGGAGGAGAG GGATCTGTGG AAAATCCTTC TGACGGACTT CCCCTCAGTG CCTGATCCAT 240 

ACTCAATAGT AGAAAAAGTA AGAAATATAC AAAGATAGCA GATACACGGA GACAGTTQCC 300 

CAAATAGCTG AGCGAWTAGC GCAGAAGCAA TATTGAAGAC CTAATAGCIX3 AGACATTTCC 360

AGAACTGATA AACTOCATCC AGCCACAGAT CAAGCAGCCC AGAAAATTCC AGGCAGCATC 420

AACAAATAAA TAGCCCCACA TGCACCCGTG AAAATGCAGA AGACCAAACA AAAAAGTCCG 480 

GTCAACAGCC AGAGTTAAAG AGG 503 

(2) INFORMATION FOR SEQ ID NO: 118: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1133 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

GGCACAGCTT GGAATGAACC CCTGTGGATA AGGGGGACTA TTAGATAGAA TAAACATCAA 60 

TAAATGCTTG ATGAATAAAC GCTAATCCTA CCTTCCCAGC CTGACACCTC CCAGTGGACA 120 

CCACACTTCA CTTGAAGCCT TAGAAACCTT TCCCACCCAT GCTTCCAGCC CTGGCTTCAT 180
GITGCCATTT CTCACCCCCA GAACAGGCCG CCCGCCTGAA GAAACTACAA GAGCAAGAGA 
AACAACAGAA AGTGGAGTTT CGTAAAAGGA TGGAGAAGGA GGTGTCAGAT TTCATTCAAG 

ACAGTGGGCA GATCAAGAAA AAGTTTCAGC CAATGAACAA GATCGAGAGG AGCATACTAC 360 

ATGATOIX3GT GGAAGTGGCT GGCCTGACAT CCTTCTCCTT TGGGGAAGAT GATGACTGTC 420 

GCTATCTCAT GATCTTCAAA AAGGAGTTTG CACCCTCAGA TGAAGAGCTA GACTCTTACC 480

GTCGTGGAGA GGAATGGGAC CCCCAGAAGG CTGAGGAGAA GCGGAACNTG AAGGAGCTGG 540
CCCAGAGGCA ANGAGGAGGA GGCAGCCCAG CAGGGGCCTG TGGTGGTGAG CCCTGCCAGC 600
GACTACAAGG ACAAGTACAG CCACCTCATC GGCAAGGGAG CAGCCAAAGA CGCAGCCCAC 660
ATCCTACAGG CCAATAAGAC CTACGGCTGT KTGCCCGTGG CCAATAAGAG GGACACACGC 720
TCCATTGAAG AGGCTATGAA TGAGATCAGA GCCAAGAAGC GTCTGCGGCA GAGTGGGGAA 780
GAGITCCCGC CAACCTCCTA GGCGCCCCGC CCAGCTCCCT TTGACCCCTG GGGCAGGGCA 840
GGGGGCAGGG AGAGACAAGG CTGCTGCTAT TAGAGCCCAT CCTGGAGCCC CACCTCTGAA 900
CCACCTCCTA CCAGCTGTCC CTCAGGCTGG GGGAAAACAG GTGTTTGATT TGTCACCGTT 960
GGAGCTTGGA TATGIGCGTG GCATGTGTGT GTGTGTGTGA GAGTGTGAAT GCACAGGTGG 1020
GTATTTAATC TGTATTATTC CCCGnTCTTG GAATTTTCTT CCCATGGGGC TGGGGTACTT 1080
TACATTCAAT AAATACTGTT TAACCCAAAA AAAAAAAAAA AAAAGAAAGA AGN 1133

(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1101 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
GGGCACAGCT GAAGCTGCAG ACCTCCCCAG GGGATGGCTC CTCTCCCCCA GGAGCCCCGA 60
GGCAGGGGAG GCAGAAAGCC TGGGCTCTGG GGGGTGGCCT GCGGACAGCT GTXK^rC?IGGG 120
CCGGGGGCTG GGCCTGTCCC ACAGGGNCGT GGAC3CTCGTG GTTCTGAGCA GCCAGCTCGG 180
TGGTGTCTGG GGATAGCTGG GAGGCACAGC GGCTGCCATG TGGGACTCGG ACTGGAGTGC 240
TCCCTGGTCT TGGCCTCTGT GGCTCAGCCT TGCTCTGGTC TCCCTGAGTG CAGGGGCCAA 300
GGGGCACAGG GCCAGTGAGG CCGGCCACGC TCGGGCCCTC ACCTGTGAGA TGGGGTCGGA 360
ATTTKACACA GCCTANGGCT TGGTTCTTGG TKGTNGAMCG TCGAdYCTK AGAACGGGAG 420
TGCTGGTCCT GAAAGGCGTG GTTGGAGACC AGCTGCTTTT CTCGCTCTTT TTCTCTTAGG 480
AGATTAAACA AAAACAGAAA GCACAAGACG AACTCAGTAG CAGACCCCAG ACTCTCCCCT 540
TGCCAGACGT GGTTCCAGAC GGGGAGACGC ACCTCGTCCA GAACGGGATT CAGCTGCTCA 600
ACGGGCATGC GCCGGGGGCC GTCCCAAACC TCGCAGGGCT CCAGCAGGCC AACCGGCACC 660
ACGGACTCCT GGGTGGCGCC CTGGCGAACT TGTTTGTGAT AGnGGGTTT GCAGCCITIG 720
CTTACACGGT CAAGTACGTG CTGAGGAGCA TCGCGCAGGA GTCAGGCCCA GGCGCCGAGA 780
CCCAAGGCGC CACTGAGGGC ACCGCGCACC AGAGCGTGAC CTCGGCAGGC IGGACACACT 840
GCCCAGCACA GGCAGACCCA CCAGGCTCCT AGGTTTAGCT TTTAAAAACC TGAAAGGGGA 900
AGCAAAAACC AAAATGTGTG ACTGGGCTTT GGAGGAGACT GGAGCCTCAG CCCTGTCCTG 960
GCCACGGGCC GCTGGGGCTG GTGTGGGTGG GCCTTGTGTG CTGGATTTGT AGCTTATCTT 1020
CCGTGTTGTC TTTGGACCTG TTTTAGTAAA CCCGTTTTTC ATTTTAAAAA AAAAAAAAAA 1080
AAACTTTGGG GGGGGGCCCC N 1101

(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:
AGCrrCTCTG TCCAGTCTTG AACTCTGGGS TCTCTTGGAA CTTTCCTCAC CCCTCTCAGC 
CTGAATATTC CTTCCATGGA TTCCACTCAA CCAGACTTTG GATCTGTGCC TACTTAATCA 
ACCTTATCTT TGCAATATGT TCGGGCCCAC CTTCCACTCC TTCGTTCTTG TTCCTCCTTG 
GCCTAACTTG TCCCTTCTCC ACTTCACATC CCCGGTGGGA CAGCATTCCT CCTTCCTCCC 
AACCTCCCTC CGTCTCARAA AAAAAAAAAA AAAAAAAAAA TT 

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2635 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 
TAAGGGGGTG TGTGCTCACC TCCTCCTGAC CCTTAACACT CCTGTCCTGC CCAGACCAAC 
AGAGAGAGCT GTCCCTGAGA CCCCGGAGAG AAGCAGCTGC CGAAAGCTGC AGCCTTTCCG 
CACTCTGAGA dCATGATCTT CCTCCTGCCA GGGGAGAGCC ACCCACAGGC CATGTCCAGC 
CCCACTTCCC TCAGCCCCCA GGGYTTCCTT CTGGCCCCTC TGAGGATTCC CTAGGGCTGC 
CCCGCAGAGG GGYTTCCCCA AGCTCTGTTT TGAAGCCTGC AATGTGGAAA AGTGAGAAGT 
CAGAGGGAAC AGGACAGGTG CAGCCGGGCT CTGAGGCCAC ACCTCACACC TCGCTGTTCC 
CCAACATCCC CTGAGCAGTG TGAGCTCATC TCACCAGATG AGAAGAGGCC CTGTGCATTT 
YTTTTGTTTG TTTGTTGCTG TTTTCCCCCA CCCATCCAGT TCTCCTCAGC AAAGCAAATT 
CCTTAACACC TTTGGTGGAG AATTTCTTAC CCAGACTTGG GGCTGTGATG CCCTTCAGTG 
CGTGGTGAGT GCAGCGTGTG TGCGTGTGCC TGTGTGTGAA CCTGGGGGCC ATCCTGGTGG 
CCTGGGAGCG TGAGGAGAGG CCCCCTGTGT GCTGGGTGAG TGGTGGGTGT GGGGTCAATG 
CAGTGAGGCT CTCTGGGTGA GGCTCCCAAC CTGGCAGTCC CCAGCCTCCC AGCATCTGTG 
AGCGTCTGTT GGACTTTACA GAAGAGCCTC ATCCYGTCTG CCCCTCACTC TGCCCTGGAA 
TCAACATCTT CCGAGTCCTT CTTGGGGGAA ATAGCAGAGC CCCACTTAAC TCCATAAACT 
GCTTCCCATT CCGCAGCCCA GTTCTGATTG TTGAGGTGTC GCGTCGTTCC AGGTCCCCCA 
GTCCCCTCTT TCTCCTGTCC TCTCTCTGTC CTTCACCTCC CCACTCCAGC CCCGGCTCAG 
TTCAGGGAAA TGCTGTTCCA YATCAGCCCT CTGCTCTCTG AGGCAGCCGC GCCTCTGACT 
CGGAGCTACT TGAAACTTCT GCTCTTGCTA GGATTGGAGT CTACCTATCT CTTCCATTTG
TCCCAC3CTGG AGTTCTGGAA CTTTCCTCCT CGGGGT33GG GTGGGGTTrG
TGGGGGGCCT GGGGAAGGAA GGAC7ITCAGA GGA.ir-GGICT CCCCTCrrCCT 
CCCTCCGCTC CTGGGACACG TGCTCTCTCT GTCTCTC-GGT CTTCrGGCTG 
TGTGTCCTTG TAAATATGTT TTAGGAAGAA ;iGC-J^.3GG ACrGAATTAG 
GATTGCAGGG GTCCAGCCTT GCCTGTTTCC GAAGCCCCCA CACTGCrTTT 
AGACTGGTCC CCTCAAAAGG TAGACAAAAC AGGiGCrCCC TCTGGAGCTG 
TCAAAGTGGC TTrTTGTTAG ACAAGGTTAA GGTTTCCTCA TGAGCA.-^7r 
TCCTTCCTCA GCTCCTTGAT TTGTGACCTT GACG^jGGGG CCTGCCACCC 
GTGCCCTCTC CTCGATGCCT CGCTCCTTCC TGCCCCCACT CCCOIQC^CTT 
GGGAATTAGG GCCATGCTGG AAGAAGCTTA ACCATGTGTT CAAAGA.-rGG 
GCTTGGTCCT GGAACTCCCC TTGGCTGCCC CAC<:-CCTCCT TGGCCCATGG 
AGGTGGATGT CAGATCTGGT AGGTTGCAGC AGAGAAAATA AATGTGCrrr 
TCAGAGAGGG TCCAAGGGTG ATGGAGAAGG AAGCATX?3CC IGGGAGCTTG 
GTGGTGGGTG GCGGCATCTT GACTGCCCCC TGTTGTCCCA CACGTGGGGG 
CTCTTCACTC CAGCCCGCCT GCCTTCAGCC TTCC=.TGAGC TTCACCTGrT 
CTTTGGAGGG GGTGGGGTCC GTTGGCATCA ACACC-GGGAC CCTCTGCTTC 
GAGCCCTCAG CCCCTGGGGA G?AC?J^ZGG CTG^GCTTTG ATACCTGGGG 
GCTGCGGGCT GGCGGCAGTC CCAGGGGAGA GACACC^uCAG AAGGAGACCC 
AGGAAGTTCC CAGCAGAGCA AACTGCTTTC CAGCCTGAAG CCTGCTT.-.iA 
TGCAATAACT GAGCTTAGAG TTAGGAATTG TGTTCAAGTG CTTGGATTrc 
TTTAACTGCT GAAATTGTAT CTCTCAGTAA ITITAGArGT CTT7rA.-_-_-A 
AAAGTGTTAG ACTGTGTGCG TGTGCGTTGA TGCC-CACTCA AGAGTCCr^ 
GCCCTGCCTT TCCCCTGCGC CCCCATCCTC ICACGTCCCG CCC/GCCTCC 
CCTGCCTCGT GTCGTCTTTA TCTGCCTATT ACTCAGCCTA AGGAAAG^:iG 
ACATGCATAA AGGAAATCAA ATGTTATTTT TAAGAAAATG GAAAAT.^AAA 
CACCAAAAAA AAAAAAAAAA ACCC^R3GGGG GGGC-CCX3GTA ACCCATTTCG 






(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 
GAATTCGGCA GAGGTTCGGC GAAGATAGGG AATAAGGAAG CACAGGAGTA GGGGAGAAGG 
AAGCACAGGA GTAGGGGAGA TATACAGCGG TCAGGATAAG GGGGAAAGGG CGGTGGTTGC 
SCAAGAGGTG AAACAAGATG TGAGAGACAA GGGGTAGGGA AGAAATGGGG CAGCGGTTAG 
GTTCAGAAGC GCATAGACCG TGGCGGACGG GCAATGCGAG GGGCACAGAA AGGAACTGAG 
GGGTGGGCTA TTTTAARGGA GATGGTCCTT CAGCCCTCTT YTTTTCTGCG TAGTTCTCCT 
CCTCCAQGCC GCGCGCGGAT ATGTCGTCCG GAAACCAGCC CAGTCTAGGC TGGATGATGA 
CCCACCTCCT TCTACGCTGC TCAAAGACTA CCAGAATGTC CCTGGAATTG AGAAGGTTGA 
TGATGTCGTG AAAAGACTCT TGTCTTTGGA AATGGCCAAC AAGAAGGAGA TGCTAAAAAT 
CAAGCAAGAA CAGTTTATGA AGAAGATTGT TGCAAACCCA GAGGACACCA GATCCCTGGA 
GGCTCGAATT ATTGCCTTGT CTGTCAAGAT CCGCAGTTAT GAAGAACACT TGGAGAAACA 
TCGAAAGGAC AAAGCCCACA AACGCTATCT GCTAATGAGC ATTGACCAGA GGAAAAAGAT 
GCTCAAAAAC CTCCGTAACA CCAACTATGA TGTCTTTGAG AAGATATGCT GGGGGCTGGG 
AATTGAGTAC ACCTTCCCCC CTCTGTATTA CCGAAGAGCC CACCGCCGAT TCGTGACCAA 
GAAGGCTCTG TGCATTCGGG TTTTCCAGGA GACTCAAAAG CTGAAGAAGC GAAGAAGAGC 
CTTAAAGGCT GCAGCAGCAG CCCAAAAACA AGCAAAGCGG AGGAACCCAG ACAGCCCTGC 
CAAAGCCATA CCAAAGACAC TCAAAGACAG CCAATAAATT CTGTTCAATC ATTTAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAAGGGGA GGGG 

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1542 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 
GGCASAGCCA CCTCGGCCCC GGGCTCCGAA GCGGCTCGGG GGCGCCCTTT CGGTCAACAT 60
CGTAGTCCAC CCCCTCCCCA TCCCCAGCCC CCGGGGATTC AGGCTCGCCA GCGCCCAGCC 120
AGGGAGCCGG CCGGGAAGCG CGATGGGGGC CCCAGCCGCC TCGCTCCTGC TCCTGCTCCT 180
GCTGTTCGCC TGCTGCTGGG CGCCCGGCGG GGCCAACCTC TCCCAGGACG ACAGCCAGCC 240
CTGGACATCT GATGAAACAG TQCTCGGCTOG TGGCACCX5TX3 GTGCTCAAGT GCCMGTGAA 300
AGATCACGAG GACTCATCCC TGCAATGGTC TTAACCCTCC TCAGCAGACT CTCTACITrc 360 
GGGAGAAGAG AGCCCTTCGA GATAATCGAA TTCAGCTCGr TAMCTCTACG CCCCACGAGC 420 
TCAGCATCAG CATCAGCAAT GTGGCCCTGG CAGACGAGGG CX3AGTACACC TGCTCAATCT 480

TCACTATGCC TGTGCGAACT GCCAAGTCCC TCGTCACTGT GCTAGGAATT CCACAGAAGC 540

CCATCATCAC TGGTTATAAA TCTTCATTAC GGGAAAAAGA CACAGCCACC CTAAACTGTC 600

AGTCPTCTGG GAGCAAGCCT GCAGCCCGGC TCACCTCGAG AAAGGGTCAC CAAGAACTCC 660 

ACGGAGAACC AACCCGCATA CAGGAAGATC CCAATGGTAA AACCTTCACT GTCAGCAGCT 720 

CGGTGACATT CCAGGTTACC CGGGAGGATC ATGGGGCGAG CATCGTGTGC TCTGTGAACC 780 

ATGAATCTCT AAAGGGAGCT GACAGATCCA CCTCTCAACG CATTGAAGIT TTATACACAC 840 

CAACTGCGAT GATTAGGCCA GACCCTCCCC ATCCTCGTGA GGGCCAGAAG CTGTTGCTAC 900 

ACTGTGAGGG TCGCGGCAAT CCAGTCCCCC AGCAGTACCT ATCGGAGAAG GAGGGCAGTC 960 

TGCCACCCCT GAAGATGACC CAGGAGAGTG CCCTCATCTT CCCITrCCTC AACAAGAGTC 1020 

ACAGTGGCAC CTACGGCTGC ACAGCCACCA GCAACATGGG CAGCTACAAG GCCTACTACA 1080 

CCCTCAATGT TAATGACCCC AGTCCGGTGC CCVCCTCCTC CAGCACCTAC CACGCCATCA 1140

TCGGTGGGAT aTKXXrTTTC ATTGTCTTCC TGCTGCTCAT CATGCTCATC TTCCTOSGCC 1200 

ACTACTTGAT CCGGCACAAA GGAACCTACC TGACACATGA GGCAAAAGGC TCCGACGATC 1260 

CTCCAGACGC GGACACGGCC ATCATCAATG CAGAAGGCGG GCAGTCAGGA GGGGACGACA 1320 

AGAAGGAATA TTTCATCTAG AGGOSCCTGC CCACTTCCTG CGCCCCCCAG GGCCCTGTGG 1380 

GGACTTGCTG GGGCCGTCAC CAACCCGGAC TTGTACAGAG CAACCGCAGG GGCCGSCCCT 1440 

CCCGNTTGTT CCCCAGCCCA CCCACCCCCT TGTTACAGAA TGTYTKGTTT GGGGTGCGC?r 1500 
TTTGTWATTG GTTTNGGATN GGGGAAGGGA GGGANGGCGG GG 1542



(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1390 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

CAAGCTCTAA TACGACTCAC TATAGGGAAA GCTOGTACGC CTGCAGGTAC CGGTCCGGAA 60

TTCCCGGGTC GACCCACGCG TCOGGGCCTC AGGGTGGACG CATGGTTCTG CACTGAGGCC 120

CTCGTCATGG TGGCGCCTGT GTGGTACTTG GTAGCGGCGG CTCTGCTAGT CGGCTTTATC 180 

CTCTTCCTGA CTCGCAGCCG GGGCCGGGCG GCATCAGCCG GCCAAGAGCC ACTGCACAAT 240

GAGGAGCTGG CAGGAGCAGG CCGGGTGGCC CAGCCTGGGC CCCTGGAGCC TGAC3GAGCCG 300 
AGAGCTGGAG GCAGGCCTCG GCGCCGGAGG GACCTGGGCA GCCGCCTACA GGCCCAGCGT 360

CGAGCCCAGC GGGTGGCCTG GGCAGAAGCA GATGAGAACG AGGAGGAAGC TGTCATCCTA 420 

GCCCAGGAGG AGGAAGGTGT CGAGAAGCCA GCGGAAAYTC ACCTGTCGGG GAAAATTGGA 480 

GCTAAGAAAC TGCGGAAN>3T GGAGGAGAAA CAAGCGCGAA AGGCCCAGCK TGAGGCAGAG 540

GAGGCTGAAC GTGARGWGCG GAAACGACTC GAGTCCCAGC GCGAATGAGT QG?J^PMGA 600 

GGAGGAGCGG CTTCGCCTGG AGGAGGAGCA GAAGGAGGAG GAGGAGAGGA AGGCCCGCGA 660 


GGAGCAGGCC CAGCGGGAGC ATGAGGAGTA CCTGAAACTG AAGGAGGCCT TTGTGGTGGA 720 

GGAGGAAGGC GTAGGAGAGA CCATGACTGA GGAACAGTCC CAGAGCTTCC TGACAGAGTT 780 

CATCAACTAC ATCAAGCAGT CCAAGGTTGT GCTCTTGGAA GACCTGGCTT CCCAGGTGGG 840

CCTACGCACT CAGGACACCA TAAATCGCAT CCAGGACCTG CTGGCTGAGG GGACTATAAC 900 

AGGTGTGATT GACGACCGGG GCAAGTTCAT CTACATAACC CCAGAGGAAC TGGCCGCCGT 960 


GGCCAACTTC ATCCGACAGC GGGGCCGGGT GTCCATCGCC GAGCTTGCCC AAGCCAGCAA 1020 

CTCCCTCATC GCCTGGGGCC GGGAGTCCCC TGCCCAAGCC CCAGCCTGAC CCCAGTCCTT 1080 

CCCTCTTGGA CTCAGAGTTG GTGTGGCCTA CCTGGCTATA CATCTTCATC CCTCCCCACC 1140

ATCCTGGGGA AGTGATGGTG TGGQCAGGCA GTTATAGATT AAAGGCCTGT GAGTACTGCT 1200 

GAGCTTGGTG TGGCTTGGTG TGGCAGAAGG CCTGGCCTAG GATCCTAGAT AAGCAGGTGA 1260 


AATTTAGGCT TCAGAATATA TCCGAGAGGT GGGGAGGGTC CCTTGGAAGC TGGTGAAGTC 1320 

CTGTTCTTAT TATGAATCCA TTCATTCAAG AAAATAGCCT GTTGCAAAAA AAAAAAAAAA 1380 

AAAAACTCGA 1390



(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1288 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

GGCGCGCGGG TGAAAGGCGC ATTGATGCAG CCTGCGGCGG CCTCGGAGCG CGGCGGASCA 60



WO 98/54963



PCT/US98/11422 






GACGCTGACC ACGTTCCTCT CCTCGGTCTC CTCCGCCTCC AGCTCCGCGC TGCCCGGCAG 120 

CCGGGAGCCA TGCGACCCCA GGGCCCCGCC GCCTCCCCGC AGCGGCTCCG CGGCCTCCTG 180 


CTGCTCCTGC TGCTGCAGCT GCCCGCGCCG TCGAGCGCCT CTGAGATCCC CAAGGGGAAG 240 

CAAAAGGCGC ATCCGGCAGA GGGAGGTGGT GGACCTGTAT AATGGAATGT GCTTACAAGG 300 

GCCAGCAGGA GTGCCTGGTC GAGACGGGAG CCCTGGGGCC AATGGCATTC CGGGTACACC 360

TGGGATCCCA GGTCGGGATG GATTCAAAGG AGAAAAGGGG GAATGTCTGA GGGAAAGCTT 420 

TGAGGAGTCC TGGACACCCA ACTACAAGCA GTGTTCATGG AGTTCATTGA ATTATGGCAT 480 


AGATCTTGGG AAAATTGCGG AGTGTACATT TACAAAGATG CGTTCAAATA GTGCTCTAAG 540 

AGTTTTGTTC AGTGGCTCAC TTCGGCTAAA ATGCAGAAAT GCATGCTGTC AGCGTTGGTA 600 

TTTCACATTC AATGGAGCTG AATGTTCAGG ACCTCTTCCC ATTGAAGCTA TAATTTATTT 660

GGACCAAGGA AGCCCTGAAA TGAATTCAAC AATTAATATT CATCGCACTT CTTCTGTGGA 720 

AGGACTTTGT GAAGGAATTG GTGCTGGATT AGTGGATGTT GCTATCTGGG TTGGCACTTG 780 


TTCAGATTAC CCAAAAGGAG ATGCTTCTAC TGGATGGAAT TCAGTTTCTC GCATCATTAT 840 

TGAAGAACTA CCAAAATAAA TGCTTTAATT TTCATTTGCT ACCTCTTTTT TTATTATGCC 900 

TTGGAATGGT TCACTTAAAT GACATTTTAA ATAAGTTTAT GTATACATCT GAATGAAAAG 960

CAAAGCTAAA TATGTTTACA GACCAAAGTG TGATTTCACA TGTTTTTAAA TCTAGCATTA 1020 

TTCATTTTGC TTCAATCAAA AGTGGTTTCA ATATTTTTTT TAGTTGGTTA GAATACTTTC 1080 


TTCATAGTCA CATTCTCTCA ACCTATAATT TGGGAATATT GTTGTGGTCT Tr rG TTT lT T 1140 

CTCTTAGTAT AGCATTTTTA AAAAAATATA AAAGCTACCA ATCTTTGTAC AATTTGTAAA 1200 

TGTTAAGAAT ' I ' lTlTlTA TA TCTGTTAAAT AAAAATTATT TCCMACAACC TTAAAAAAAA 1260

AAAAAAAAAA AAAAAAAAAA AAAAANAA 1288 




(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1517 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

AGTGGCTTAA AGGCATCGTT TTAGGGATTA CTGGGAAGTA TCTTCAAAGT AATACATGAG 60 
AAACATTCCT TCCTAAATCC TTTATTATAT TGAATATCGT ATTAATTGGT TTTCAGAGGT 120 


TAAATTAACC ATCTATTCCT GCAATAAATG TCACITOTNT CTTGTATATA ATCTTTTTTA 180 

TATATTACCG GATTGATTCA TTAGTATTTT GTTGAGGATT TTTGTGTCTA TATTCATAAG 240 

AGATGCTCGT CTCCAGTrTT CTTTTTTTGT GATAATCTGG ITTTTGTATC AGTAATACAG 300 

GCCCCATCAA ACGAGTKSGG AAGTGTTCAC CTCTCTTGTA rmTTCAAG AGTTTGTGAA 360 
GAATTGCTAT TAATTCTTTA AATGTTTGGT AGAATCTACC ATTGAAATCA TGTGTCCTGG 420

GCTTTTTTTT GAGGGAAGTG TTCTGATAAC TAATTCAGTA TCTACTTTTT ATAGCTCTGT 480 

TCAGATTTTG CTTCTrCCTC ACTrAGTTTT GGTAATTrOT GTATCTCTAG GARITTGTCC 540 
ATTTCATITA TCTCATTrGT TGGCATAAAT TAAACTAAAT TTGGCCTGAG CCTACCTGTA 600
TATCTTGAGT CCCTCTGTAA GGAACTGTAG CCTAACrTGT ACATAAACAA ACTGAAATCC 660

TAAATTAGGA ATGTAGTTTT TGTAACAGCT CCTGAGTCTC AGGCAGTCAC AGCAGYCAAG 720

TCTGTCAATT GCAGGCTCCT AACTAAGCAG CCCATGSTCA AATGAGGCAA AAACCTTTGC 780

TTTTAACACA TAGTATAGCT TTCTAATCCT TTTCTTGCAC ACTCGGGTAA rrTCTTCCrr 840

TTTCArrccc KGWAnrrcc akgaatatga rtctyccttt tttcccctcc tgtcagtcta 900

GCTAATGGTT TGTCAATTTT GTTGATCTTT TGAARAACAA ACCTTTGGTT OIACTTTCTT 960

GITCCATATG CTCARTATTC TCATAATTGG AGTGGAAAGC TGATCTTTGA TTACTTAITr 1020

TACTTAGGGC TGAGGAGTTC ATGGACTTCG CAAAACCTCC TTGAATCTAA attgcatctt 1080

CTTTCCIGGT TTCTGGGCTG AAACATGTTT TTTCCCATCT WANAWACCCT TGOTCTTTTC 1140

ATKGGCGATT AAGACTAGAG AAAGTTCTAG AIWCCTIGTC CTTTTATGCT GTCATTTTGT 1200

TTAAAGGCTT TCTATGTAGT AAAACTATCT ATATAGACAA AATAGAGCCT TGAGircrGG 1260

TCTTGAATTT GATCAACATG ATTTACCACA TTCTGTACTG GATATTTCTT CACCTGCTGC 1320

TACTGTAAAC CATTTTATTC TTCGATCTTC TGTAGAGTAT ATTATCACAG GTACrmTA 1380

CAGGGGTOTC TAATCnTTG GCnCCCTGG GCACATTGAA AGAAGAAGAA TTGTCITCGG 1440

CCACACATCA AATACGCTAA CACTAATAAT AGTTGATGAG CTAAAAAAAA AAAAAAAAAG 1500
GCAAAAAAGN CCCAAAA 1517

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1073 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

TGAATCTATT CTTTGAACAT TCTACAACAA GAATTACATT ATACTGTTAT ACCAGAGTAC 60

TTCTGCAGTG TGAAATAGAT TGGTTTGGAA AATGAACCTG GCTTTGCTAT AAATTACATT 120 


CirAGGCCTT TTTGCAAATG TGTAACTTGC CTATCAAAGT AGTTTGTAGG GCAAATGCAG 180 

AATATATGTC TCCATCTGGT AAAGTACCTT WTAYTCATGT GGGAAATCAA GTAGTATCAG 240 

AACTTOGTCC AATAGTCCAA TTTGTTAAAG CCAAGGGCCA TTCTCTTAGT GATGGGCTGG 300

AGGAAGTCCA AAAAGCAGAA ATGAAAGCTT ACATGGAATT AGTCAACAAT ATGCTGTTGA 360 

CTGCAGAGCT GTATCTTCAG TGGTGTGATG AAGCTACAGT AGGGRMGATC ACTCATGMTA 420 


GGTATGGWrC TCCTTACCCT TGGCCTCTGW VmZATATTTT GGCCTATCAA AAACAGTGGG 480 

AAGTCAAACG TAAGNTGAAA GCTATTGGAT GGGGAAAGAA GACTCTGGAC CAGGTCTTAG 540 

AGGATCTAGA CCAGTGCTGT CAAGCTCTCT CTCAAAGACT GGGAACACAA CCGTATTTCT 600

TCAATAAGCA GCCTACTGAA CTTGACGCAC TGGTATTTGG CCATCTATAC ACCATTCTTA 660 

CCACACAATT GACAAATGAT GAACTTTCTG AGAAGGTGAA AAACTATAGC AACCTCCTTG 720 

CTTTCTGTAG GAGAATTGAA CAGCACTATT TTGAAGATCG TGGTAAAGGC AGGCTGTCAT 780 

AGAGTTATGT GTTAGTCTCA GGAGTCTTAA CTTTTGAAAT ATGTTTTACT TGAATGTTAC 840 

ATTAGATATT GGTGTCAGAA TTTTAAAACC AAATTACTGC TTTTTGAAAC CTCAAATTAT 900

ATAATGTATC TTATGTATGT GCTTTATATT GTTATTTGTG TATACATTAA AATAATTCTG 960 

AATTATTTAA TCTGATATGT TGTATTCTGT ATCTTGAAAT 1TTTGTTTCC TTGAAACATG 1020 


CATGCATTTA AAAATAAAGC TTAAACAACT GTAAAAAAAA AAAAAAAAAA CTC 1073 




(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 


CAACCCCTGC CTTTTTTTTG TTTTCCATTT GCTTGGTAGA TCTTCCTCCA TCCCTTTATT 60 
TTGAGCCTAT GTGTGTCTCT GCCCGTGAGA TGAGTCTCCT GAATACAGCA CACTTACTGG 120 
TCTTGACTCT GTATCCAATT TGCCAGTCTG TGTCTTTCAT TTGGAGCATT TAGCCCATTT 180
ACATTTAAGG TKAATATTGT TATGTGTGAA TTTRATCYTR TCATTATGWT GTTAGCTGGT 240 
TATTTTGCTT GTTAGTTGAT GCAGTTTCTT CCNGGCATCA ATGGTCTTTA CAAOTTGGCA 300 


(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1275 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
GGCAGAGCCT GTCCCTGCTG CCCCTGCAAA AAAAACCCCC TCTGGTGTGA GCAGGATGGT 60
TGGAGGTTAT GTCAGCTCCT TCTCCTTTCC TCCAGTTTCC TCTTCCCTTC TCCTCCCTGC 120
CTCTTTTCCT TTTCCdTTC TTCCTGGTAC CCCCTGCCCA TTCCTGTATT TTCTCCCATC 180

GCCATrCTCC CCTCTCCCAC TGTCCCTAAC CCGTTCAAAC TCTTTCCTCT TAAATGGTTG 240

AGATITTCTC TCACCAAGCA CACCCCAGTA TTAATTAAAC TAGCTGCAAA CAGGCAGCAA 300 

GTCGTCTACC ATOACAGATG GGTTTTGTGT GTGTGTGTGT GTGTGTAATT GTAATAAAAC 360 

ATATTOARTC ACTCAATAAA CACAGAGTGT CTACTACATG TATCARGCAC TATCATAGAT 420 

GCTAATTAAC GAAACTCAAA TGGCCAGGCC CTCACAGTGG CTCATGCCTA TAATCCCAGC 480 

ACTTTGGGAG GATGAGGCAG GAGGATCACT TGAGGCCGGG AGTTCAAGAC CAGCCTGGGC 540

AACATACJTAA GACTCCATCT CTACAAAAAA AAAATTTTTT TTATTATACT TTAAGTTTTG 600 

GGTTACATGT GCAGAACGTG TAGTTTTGTT ACATAGGTAT ATACGTGCCC TGGTAGTTTG 660 

CTCCACCCAT CAACCCATCA CCTACATTAG GTATTTCTCC TAATGTTACC CCTCTCCTAG 720 

CCCCCCACCC COTGACAGGC CCTGGTGTGT GATGTTCCCC TCCCTGTGTC CATGTGTTCr 780 
CATTGGTCAA CTCTCACCTA TGGAGTOAGA ACATGTGGTA TTTGGTTTTC TGATCTTGTG 840
ATAGCTTGCT GAGAATGTKG GTTTCCAGCT TTATCCACGT CCCTGCAAAG GGCATAAACT 900
CATCCCTTTT TATGGCTCCA TAGTCTTCCA TGGTGTATAC GTGCCACATT TTCTTAATCT 960

ATCArrcATC GACAAGTTTT GCTATTGTGA ATAGTGCCAC AATAAACATA CGTGTGCGTG 1020

TGTCTTTATA GCAGCAIGAT TTATAATCCT TTGGGTATAT ACCCAGTAAT GGGATCACTG 1080

AGTCAAATGG TATTTCTCGT TCTAGATCCG TAAGGAATTO CCACACTGTC TTCCACAATG 1140

TTTGAACTAA TNTACACTCC CACCAACAGT GTAAAAGTGT TrCTATTTTT CCACAACCTC 1200

TCCAACATCT GTTATrTCCT GACTTTTTAA TGAACGTCAT TCTAACTGGC GTGAGATGGT 1260

ATCTCATTGT GGTTT 1275

(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 


CNGAAACCCC GTGAACCCTC CCCGGGTTAA AAAGCCCCCC CTAAATGGGG GGAACGCVTC 60 

ACACGTTATA AAAAAGCACT AGAATGTTTT GAAAGCGAGA AACAACAGCT GTGTAGGGTA 120 

GCTAGCAGTT AGTGTTGTAC AGAAGACAGA TATTTGTGCA TTTYTGCATT TTCTAAGTTT 180

GCTGCAATGA GCATGTATTA CTTTCATAGT TATAAAACAC ATGCAAAATG CCCTTTTAAA 240 

ATGAAAAAAA ATCCATGAGT GTAAGTGATA TATATGCTIT GGAAAGCCTG GGACGGTCAT 300 


TGTTTACTCT CAATAGTATG TGTTTGCCTT TGTCTTTTTG AGACATnTG TTTTAATCTG 360 

TTGATGACAA TAACCTGITG ATAATATAAC TTGATAACAA ATAAAATGAC TTATGATTGA 420 

AWMAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA NN 472



(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1950 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

ACCTCTCAGA ATCTTCTCTC AGCAACCTGA GTCTTCGCCG TTCCTCAGAG CGCCTCAGTG 60

ACACCCCTGG ATCCTTCCAG TCACCTTCCC TGGAAATTCT GCTGTCCAGC TGCTCCCTGT 120 

GCCGTGCCTG TNATTCGCTG GTGTATGATG AGGAAATCAT GGCTGGCTGG GCACCTGATG 180 


ACTCTAACCT CAACACAACC TGCCCCTTCT GCGCCTGCCC CTTTNTGCCC CTGCTCAGTG 240 

TCCAGACCNT TGATTCCCGG CCCAGTGTCC CCAGCCCCAA ATCTGCTGGT GCCAGTGGCA 300 

GCAAAGATGC TCCTGTCCCT GGTGGTCCTG GCCCTGTGCT CAGTGACCGA AGCTCTGCCT 360

TGCTCTGGAT GAGCCCCAGC TCTGCAACGG GCACATGGGG GGAGCCTCCC GGCQGGTTGA 420 

GAGTGGGGCA TGGGCATACC TGAGCCCCCT GGTGCTGCGT AAGGAGCTGG AGTCGCTGGT 480 


AGAGAACGAG GGCAGTGAGG TGCTGGCGTT GCCTGAACTG CCCTCTGCCC ACCCCATCAT 540 

CTTCTGGAAC CTTTTGTGGT ATTTCCAACG GCTACGNCTG CCCAGTATTC TACCAGGCCT 600

GGTGCTGGCC TCCTGTGATG GGCCTTCGMA CTCCCAGGCC CCATCTCCTT GGCTAACCCC 660
TGATCCAGCC TCTGTrCAGG TACGGCTGCT GTGGGATGTA CTGACCCCTG ACCCCAATAG 720
CTGCCCACCT CTCTATGTGC TCTGGAC5GGT CCACAGCCAG ATCCCCCAGC GGGTGGTATG 780
GCCAGGCCCT GTACCTGCAT CCCTTAGTTT GGCACTGTTG GAGTCAGTGC TGCGCCATGT 840
TGGACTCAAT GAAGTOCACA AGGCTGTGGG GCTCCTGCTG GAAACTCTAG GGCCCCCACC 900
CACTGGCCTG CACCTGCAGA GGGGAATCTA CCGTGAGATA TTATTCCTGA CAATGGCTGC 960
TCTGGGCAAG GACCACGTCG ACATAGTGGC CTTCGATAAG AAGTACAAGT CTGCCTTTAA 1020
CAAGCTGGCC AGCAGCATGG GCAAGGAGGA GCTGAGGCAC CGGCGGGCGC AGATGCCCAC 1080
TCCCAAGGCC ATTGACTCCC GAAAATGTIT TGGAGCACCT CCAGAATGCT AGAGACCTTA 1140
AGCTTCCCTC TCCAGCCTAG GGTGGGGAAG TGAGGAAGAA GGGATTCTAG AGTTAAACTG 1200
CTTCCCTCTT GCCTTCATGG AGTTGGGAAC AGGCTGGGAA GGATGCCCAG TCAAAGGCTC 1260
CAAGCGAGGA CAACAGGAAG AGGGATCCAC TGTTACCAAA AGTCCTGATT CCCCCATCAC 1320
CAACCTACCC AGTTTGTTCG TGCTGATGTT GGGGGAGATC TGGGGGGAGT TGGTACAGCT 1380
CTGTTCTTCC CTTGTCCTAT ACCGGGAACT CCCCTCCAGG GTACCCACAG ATCTGCATTG 1440
CCCTGGTCAT TTTAGAAGTT TTTGTTTTAA AAAACAACTG GAAAGATGCA GAGCTACTGA 1500
GCCTTTGCCC TGAATGGGAG GTAGGGATGT CATTCTCCAC CAATAATGGT CCCTCTTCCC 1560
TGACGTTCCT GAAGGAGCCC AAGGCTCTCC ATGCCTTTCT ACCTAAGTGT TrGTATTTTA 1620
TTTTAAATTA TTTAITCTGG AGCCACAGCC CCCTTGCTTA TGAGGTTCTT ATGGAGAGTG 1680
AGAAAGGGAA GGGAAATAGG GCACCATGGT CCGGTGGTTT GTAGTTCCTT CAAAGTCAGG 1740
CACTGQGAGC TAGAGGAGTC TCAAGCTCCC CTTAGGAAGA ACTOGIGCCC CCTCCAGTCC 1800
TAATrrrrcT tgcctccccc gccttgggga atgcctcacc cacccaggtc ctgacctgtg 1860
CAATAAGGAT TGTTCCCTGC GAAGTnTGT TGGATGTAAA TATAGTAAAA GCTGCTTCTG 1920
TCTTTTTCAA AANAAAAAAA AAAAAAAACT 1950



(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 
TGGAAGATTT AAAATAGGTT TCATATTTCT CTTGAATATG AATATATAAG CTTGAATAAG 60
CTCGAGTCCr TATTATTA-TG AAATTTTCCT TATTATTTCT ACCAATGCTT CTTATATTAA 120



AGCCTGArCT rTCTCATAT? AC^TATATGTA CATTAGCTGC CTGTGGATTA ACATTTCCAT 180 

:^aatg::at7 rTT'3-cArrGT 'ttgatcttaa AcrrrrTGTG tctttatata aggtatgcty 240



CTTTTAAC-C-. TGATATCTTT AACCACAATA GTTGAAAGAC AATCTYCACC TTTTACTTGT 300 



^.T^^-TTAn^-T GTA-ATGfAAT TTTTCATGCA TATTACGTCT TATTATTTAA CCAACCTATT 360 


TTATTTTA.TC TA&3C-C-.TTT TTCA3AAAGC CTTATTTTCT TGTATTAATC AAATATTTTT 420 



AYC-.TTCTAT nTCCTfZTAT TA3TTAGKAA TACGKTACYC YAAATATATA OrGTCGSTAT 480 
ZTTCAGAATT GCAATAT'SCC TCCTTAATTT ATTAGAGGCT AACCTAAATT ATTACTTTTA 540



CCACTTACTT GAPiA^.nCTG GAi-CTTTAGA ACATTTATTG TTTTATGCAT TTTAATTCTA 600 



CTTSTATTTT TACTACTCCT AA^CATTATT ATTGTTTTAG ACAAGCCAAA ATATATOTTG 660 


TTACTATCn ATYCr-CCATT TCTTTCTGTA TTTTTATGCC ACTATGTATG CTCAATTTCC 720 



TTCrATGTGA Ta=iACCTAAT TG^GTACTTT TGTTTTTTAA TCTGTGCAGG TAGCCTGGCX: 780 
ATTi^AATTTT TAmTTQGT TTGCTGAAAA AATTGTGTTT ATTTCTATAT GCATACTTAT 840
GCATATASAA TMCTAG^TG AC^TA' i ' i ' ITT AGTATTTATA AATGTAAAGT CATTWATTKG 900 



GCTTCTArCA TTTCKCnKGA GA^TCAATT GTCAGCCCAA TAGTTTTTCA TTTTAAATTA 960 


CNGA^TtTTT TCATGrCrCT GGTTTTAGGA 990 




(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1720 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

GTCTGATAAG Ca^^CTGTGGT TATTCCCCTA AAGTTTACTT CAGCACTAAC ACTAGTGCTT 60 
CCGCTGGAGT TTGCATTTTT CCAGCTTTAT ACAGGATTTT CCTTTGACTG GAAGAGTCAA 120 
GGATATAGAG ACTCAAZAGT anCATTTATT GTACAACATC AAGGGGAATA GGATACTCAT 180



CAPACTGGGA TTATTCtTAT O^AAACATGG TCTTCTTTGA ATAAGAAAAA TACATAGTTG 240 



GTTATTATGG ACTTAAAACT GCGTTAAATG GATATTCTGA TAAAATATTT GCTGCTCTGT 300 


AGAGTGTCGA AAATCTGAGA ATATTAGCTT TACTCATCTT GAGCTITGAG GATGTTCTCT 360 

GTACGCCGAT GGTTTCATAT TAACTAAAAA AGCTGGGTAT TGTAAAATCT CATTTATAAA 420 

AACTCV3ATG AGAAGA?iAAT 'HTCTTTGAT GGTGAGACTG 1TGTCTTAGT TCAGGAAATT 480
ATTTAATAAT CCTTTGTTAC CTCTGAATGA AGGAACTTTG TAATTCTGAT TTATCGTAAA 540
ACATGAGCCT TTCCAGAGTC AGCTTAGACA CTGTTGTCGC AAATAGCCAT GCnTCCCTT 600
A-rcCCAAGGA GGCCCAGAGG GAGGGCCTAG TCTTCCTCTG TTGCTGTACA TATATTGAAA 660
TGCTTirrTT TTrTATTTTG CATTTGTTAT CTATAATGAG CTTTCTGAGC CCTGATATTA 720
TGTGAGACAA ACAGGAOTTA TTGATGTTAT ACACTCCCTT CCATTCAGGA TTTTCTGCIT 780
GGAGGGAAAT ATGTTGACCT TAGAGAATTG TGAATATTGT TGCAATTCTT GAATATATTA 840
CCATGTGAAT AATAGAGACT GTCTTGCTCT CTAGTATAAG CTATATTTAT TTTTGATrCA 900
TTTGAArrAC TAGTTATAAC TGGAGAAATT TICTTACCTC TATCCTGGCT TGCCTGACTG 960
GCTCTATAAT AGCAGCAGCC TCTITTAGAG CATCTTAATG AAAACATGGA TGAAAGGAAT 1020
TAATCATCAT ATCTCCAGAC TGCGTAGAAA ATGGCTTTTG TTCCCAGCGT TAACATTTTC 1080
TTCTCAATCA CATITCAATG TTTGTGGAGA GTGGCAGATT CACACCAGAA ACACTAGGTG 1140
TTCATATCCA TAGCATGGAT GCAGAATAAG CAGTTGGGAG AGAAGCTTCT TCCTACCTGG 1200
TACTCCrcCC ATTCACCTCA GCCCAGCCCC AGACAGGCGT TAGCATTCAG TGTGGGCCCT 1260
CAGGCAGCCC TGAAGCCTGG CTGGGTCATC AGATGGGGGC AGCCTGTGAC GGGCACCAGC 1320
GGCCTCATTC CAGGGAAGAG ITCCTGGAGG GTGTTGGCTG TTTTTGTTAG CTCAGTTTTT 1380
TTCTGGGCTC CACCATTCCT AACTCCAGGT AGACAAGATA GATGTCACAC ACAACAATTT 1440
TAAAGTATTT TGCTTAGTGC ATmTCTTTA TGATTGCAGT GTTTGTTrCT TATTTAATAG 1500
GCnTTTACT TCATTCTATT AAATTTTAGT GTTTAGAAGA GGCGGGTACT GTCACTGTGT 1560
AAAATATGTA ATATITTATA TGTTATACCA TGTCATATAT ACTTGCAATA TCAGACCTTG 1620
CATTCAATAT ACAATGCAAT TCACTCrTTG CAGACCTGCA TTTTTCAGTG AACAATAAAA 1680
AGATTOTCTG GCACTCCAAA AAAAAAAAAA AAAAAAAAAA 1720



(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 705 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 
GGCACGAGGC CATCTGGGCT CATTCAGCAG GAAATAATGG AAAAAGCTGC AATATCCAGG 60
TCTTTACTAC AATCTCGAGG CAAGATCTTT CCTCAGTATG TCCTCATGTT TGGGCTGCTT 120
GTGGAATCAC AGACACTCCT AGAGGAGAAT GCTOTTCAAG GAACAGAACG TACTCTTGGA 180
TTAAATATAG CACCTTTTAT TAACCAGTTT CAGGTACCTA TACGTGTATT TTTGGACCTA 240
TCCTCATTGC CCTGTATACC TTTAAGCAAG CCAGTGGAAC TCTTAAGACT AGATTTAATG 300
ACTCCGTATT TGAACACCTC TAACAGAGAA GTAAAGGTAT ACGTTTGTNA AATCTGGGAA 360
GACTTGACTG CTATTCCATT TTGGGTATCA TATGTACCTT GATGAAGAKX; ATTAGGTTGG 420
GATACTTCAA GTGAAGCCTC CCACTGGAAA CAAGCTCCAG TTGTTTTAGA TAATCCCATC 480
CAGGOTGAAA TGGGAGA3GA ACTTGTACTC AGCATTCAGC ATCACAAAAG CAATGTCAGC 540
ATCACAGTAA AGCAATGAAG AGCAGTTTTC CAATGAAAAC TGTGTAAATA GAGCATCAAC 600
AAGTACAAAA TTCTTGTCTT AATTAGTGGG GGTATATAAA AATTCCTTGT AATGGTCAAA 660
TATTTTTTAA AATTGACATT AATAAAGCAT ATTTTAAAAG TTTCT 705



(2) INFORMATION FOR SEQ ID NO: 135: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

AGCACACACC TCCTTTAGTT GCTCCTAAGG TCATGTTCAA CATTCGTGGA GTGCATTTTC 60 


TGCTCAGGGA GCTTTCCCAG ACCCGGAATG TTTGGTGCTC ACAGACYCTG GCAAGGATCG 120 

GTATTGCTGT TCCTCAGTTT TGCCTGGGGA AATGGAGGST CAGTGACGTT CAGTGACGTG 180 

CCCAGAGTCA TGCCATTCGC GGGTGGCCCA GKGWTCCAGG TCTCCAGCAC CCCTCGGCCC 240

CCTCCTCACC AGGTCACATC ATCTCCTGGA TTAGAATCTG CTCACATAGT CTGTCCTGAA 300 

AGGAAAAAAA AAAAAAAAAA AAC 323 




(2) INFORMATION FOR SEQ ID NO: 136: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 582 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 
GGACGGAATG GTGCAACCCT CCTWMmTT CTKGKGCTGT TGACAACAGA GGGAGGGAGG 60

GAAAACATTT TTYGTGGGAG AATCCTACYT CTGCAGSGGA GCCCTTAAGC GATKGATTTT 120

GAATCTKGAC CCTTTACCAA CTAATITrGA AGGAAGATAC CTTGGAAATA TTTGGCATTC 180

A£?rcGGTTAC TGAAACAGCA TTAGTGAATT CATCTAGAGA ACTCTTTCAT TTATTCAGGC 240

AACAACTOTA CAACrTGGAA ACCTTGTTAC AGTCCAGrTTG TGA1TTTGGG AARGTATCAA 300
CTCTACACTG CAAAGCAGAC AATATTAGGC AGCAGTGTGT ACTATTTCTC CATTATGTTA 360

AAGimCAT CTTCAGGTAT CIGAAAGTAC AGAATGCTGA GAGTCATGTT CCTGTCCATC 420

CTTATCAGGC TrTGGAGGCT CAGCTTCCCT CAGTGTTGAT TGATGAGCTT CATGGATTAC 480

TCrrCTATAT TGGACACCTA TCTGAACTTC CCAGTGTTAA TATAGGAGCA TTTGTAAATC 540

AAAACCAGAT TAAGGTTTGA CTGGTTTCAT TTGAmTTA AG 582

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

TTCGGCAGAG CCCTTGCGCG CTCITCAATA CCTGCKTTCT GTAGCGCTAG TICTCTTCAA 60 

GATTIGCITA GTCTCATTTC ATTTCGGTTT CTTTTCTCGC CATGTmTC TGTCGGAATT 120 

ACGGTTCGrr TTCGTTCTAT GTACTCTCTA AAATGTTATC GTTTTTCATr TGTCTACTAA 180 

nTTCGTGCA riTCTTACTA CTCAGTTTCT TAATATCTGA CTGGCCTCCG CCCACGGGCT 240 

CTCCAGANCA TAAAATACTC AGGCTGATGG TAGTGCAGAG ACTCTCCCTC CTTGATCAGC 300
GCAAACGTTG GTCTOAGGCT TGAGGGATGG AGCAACATTT TCTTGGCTGT GTGAAGCGGG 360

CTTGGGATTC CGCAGAGGTG GCGCCAGAGC CCCAGCCTCC ACCTATTGTG AGTTCAGAAG 420
ATCGTCGGCC GIGGCCTCTT CCTTTGTATC CAGTACTAGG AGAGTACTCA CTGGACAGCT 480

OTGATTIGGG ACTGCITrCC AGCCCTTCCT GGCGGCTCCC CGGAGTCTAC TGGCAAAACG 540

GACTCTCTCC TX3GAGTCCAG AGCACCTTGG AACCAAGTAC AGCGAAGCCC ACTGAGTTCA 600

GTTGGCCGGG GACACAGAAG CAGCAAGARG CACCCGTAGA AKARGTGGGG CAGGCAGARG 660

AACCCGACAG ACTCAGGCTC CRGCAGCTTC CCTGGAGCAG TCCTCTCCAT CCYTGGGACA 720

GACAGCAGGA CACCGAGGTC TGTGACAGCG GGTGCCTTTT GGAACGCCGC CATCCTCCTG 780

CCCTCCAGCC CTGGCGCCAC CTCCCGGGTT TCTCAGACTG CCTGGAGTGG ATTCTTCGCG 840

rKSGrrTTGC cgcgttctct gtactctggg cgtgctgttc acggatctgt ggagctaagc 900

AGCCTTAGAT AGCAGCAGAA GGCTTTTTGG ATTCTCCTCC TTGAAAAGAT TCTCAGTTAC 960

CAAACGTCTC CACCTAGAAA ATAAAAATAC ATTAAGATGT TGANAAAAAA AAANAAAAAA 1020 


A 1021 




(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 


CGGAAGATGA TGGCTTCAAC AGATCCATTC ATGAAGTGAT ACTAAAAAAT ATTACTTGGT 60 

ATTCAGAACG AGTTTTAACT GAAATCTCCT TGGGGAGTCT CCTGATCCTG GTGGTAATAA 120 

GAACCATTCA ATACAACATG ACTAGGACAC GAGACAAGTA CCTTCACACA AATTGTTTGG 180

CAGCTTTAGC AAATATGTCG GCACAGTTTC GTTCTCTCCA TCAGTATGCT GCCCAGAGGA 240 

TCATCAGTTT ATTTTCTTTG CTGTCTAAAA AACACAACAA AGTTCTGGAA CAAGCCACAC 300 


AGTCCTTGAG AGGTTCGCTG AGTTCTAATG ATGTTCCTCT ACCAGATTAT GCACAAGACC 360 

TAAATGTCAT TGAAGAAGTG ATTCGAATGA TGTTAGAGAT CATCAACTCC TGCCTGACAA 420 

ATTCCCTTCA CCACAACCCA AACITGGTAT ACGCCCTGCT TTACAAACGC GATCTCTTTG 480

AACAATTTCG AACTCATCCT TCATTTTCAGG ATATAATGCA AAATATTGAT CTGGTGATCT 540

CCTTCTTTAG CTCAAGGTTG CTGCAAGCTG GGAGCTGAGC TGTCAGTGGA ACGGGTCCTG 600 

GAAATCATTA AGCAAGGCGT CGTTGCGCTG CCCAAAGACA GACTGAAGAA ATTTCCAGAA 660 

TTGAAATTCA AATATGTGGA AGAGGAGCAG CCCGAGGAGT TmTATCCC CTATGTCTGG 720 

TCrCTTGTCT ACAACTCAGC AGTCGGCCTG TACTGGAATC CACAGGACAT CCAGCTGTTC 780

ACCATGGATT CCGACTGAGG GCAGGATGCT CTCCCACCCG GACCCCTCCA GCCAAGCAGC 840 

CCTTCAASTT CTTTTATTTC TGGGTAACAG AAGTAGACAG ACAGGTTACT TGGTGTATCT 900 

TCTGTTAAAG AGGATTGCAC GAGTGTGTTT TCCTCACACA CTTTGATTTG GAGAATTGGT 960 

GCTAGTTGGC AATAGATAAC TCAGCGTAGA TAGTATTGCA AAAAGGGGAG GAAATACACA 1020 

ACAATAATAA ATGTAAAAAC CTGCTATTCA ACATGCAGTT TTATTTCGAR GCCAAAAATC 1080

TAGAGCTTTC CCAAGATCCT GITGCCTTAG GCACATNCAC ACTTCAACAG TGCACACTAT 1140 

CCAACAGTGC ACACTATTCA ACAGTGCACA CTATTCAAAA GCGTAGACTA TTTTTTTGCA 1200 

TGTTCAAGAT ATTTCTrTTG GTCTTATGTG TGTGTGAGAG AGAGAGATTC CTTTGACATT 1260

AAGGAGCATC AATGAGAAAA GATGATGAGG CAGGAATTAA TAAAGAAATG AAGTCGTGTG 1320 

TOrTTGGTTG CCTGTCAGAG GGCACACAAT TTCATAAACA CCATGCCTGG ACAATTTGAT 1380 

ATTAATATTT AACACCTCTG CATCTTTTTC TTAAAAAAGA ATATGGGCCA GATACAGTGG 1440 
CTCACATTTG TAATCCCAGC ACTTTGGGGA GCCAAGTTAG CAGAATCCCT TGAGCACAGG 1500

AATCTCAAAC CAGCTTQGGC AACATAGTGA GATCCCATCT NTACAAAAAA CTTAAAAATT 1560 

AGCCAGGCAT GATGGCACAT TCCTGTAGTC CTAGCTACTC AGGAGGCTAA GGTAGGAGGA 1620 

TTGCCTGAGC CCAGGAGTTC AAGGCTGCAG TGAGCTAAGN ACGTGCCAGT ACACTCCAGC 1680

CTGAGCCACA AAGTCAGACC CTGTCTCGCA AAAAAAAAAN TTAAAAAGTC GGGGGGGGGC 1740 

CCGGTACCCA AATCGCCGGA TATGATCGTA AACAATC 1777 

(2) INFORMATION FOR SEQ ID NO: 139: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

TTirTTTTTT 'rrri'riTTTT m ' ri ' l ' i ' i " ! " ! ' TTTTTTTGGG AATGAGAAAA TAACTTTATT 60 

TTCATrGTGG GGAGCGGGCC GATGTCCAGC CTCAGAACTT CTGGAACTGC TrCTTGGTGC 120 

CGGCAGCCTT GGTGACCITG AGCACGTTGA AGCGCACTGT CTTGCTCAGA GGCCGGCACT 180 

CGCCCACTGT GACGATGTCA CCGATCTGGA CGTCCCTGAA GCAGGGGGAC AGGTGTACAG 240

ACATGTTCTT GTGGCGCTTC TCGAAGCGGT TGTACTTGCG GATGTAGTGC AGATAGTCTC 300 

GGCGGATtSAC AATCGTCCTC TGCATCTTCA TCTTGGGTCA CCACGCCAGA GAGGATCCGC 360 

CCTCGAATGG ACACATTACC AGTGAAGGGG CATTTCTTGT CAATGTAGGT GCCCCTCAAT 420 

AGCCTCCTTG GGGTGTCTTT GAAGCCCAGA CCGATGTTCT TCTTAGTAAC CCGCGGGAGC 480 

rrCTCCTTGC CAGTTTCTCC CAGCAGGACC CTCTTCTTGT TTTGAAAGAT GGTCGGCTGC 540

TTrrCGTAGG CACGCTCAGT CTGAATGTCC GCCATCTTCT CGTGCCOIAY TCCTGCAGCC 600

CGGGGGATCC ACTAGTTCTA GAGCGGCCGC ACCGCGGTGG AGC 643 



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1220 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

GGCACGAGGA TGATAGACCT ACTGGAGGAA TACATGGTTT ACAGGAAGCA TACCTACATR 60

AGGCTTGATG GCTCATCCAA GATCTCGGAG AGGCGAGACA TGGTTGCTGA TTTTCAGAAC 120 

T^AATGACA TCTTTGTGTT CCTGTTAAGC ACACGAGCTG GAGGACTGGG TATCAATCTC 180 

ACTGCTGMAG ACACAGTGCA TTTTCTATGA TAGCGACTGG AACCCCACTG TGGACCAGCA 240

GGCCATGGAC AGGGCCCACC GCTTAGGGCA GACAAAGCAG GTTACTGTGT ACCGGCTCAT 300 

CTCTAAAGGC ACCATTGAAG AACGCATTCT GCAAAGAGCC AAGGAGAAGA GTGAGATTCA 360 

GCGGATGGTG ATTTCAGGTG GGAACTTCAA ACCAGATACC TTGAAACCCA AAGAGGTGGT 420 

TAGTCTTCTT CTAGACGACG AAGAGTTGGA GAAGAAACGT ATGTACTCTA AACCTCTATA 480 

CACTCCCCTC ACGTATCTGA GAATGGAAGA GGTACTTGGS TGTGTGCCAA GGGTTAGGCA 540

AAGCCAGAGG CTGTATTTAG GGAAAGTATT TTTGTGCTCA TATTTTATAT AAAAACCCAA 600 

ACAAGAATGT GTTTGTAGGC CAGGCGTGGT GGCTCGCGCC TCTAGTCTCA GCATTTCGGG 660 

ARGCCAAAGT GGGCAGATCA CCTGARGTCA GGARTTTGAG TTTGARACCA GCCTGGCCMA 720 

CGTTGTGAAA CCCCACCTCT ACTARGARTA CSGAAAATTG GTTGGGCATG GTGGCGGGCA 780 

CCTGTAATTC CAGCACTTTG GGAGGCTGGG GCAGAANAAT TGCTTGAGCC CAGGAGGTGG 840

AGATTGCGGT GAGCCGAGAT YGT(2CCATTG CAlfTCCAGCC SGGGCAATAA GAGTGAAAYT 900 

CCATCTTTTA AAAACAAACA AAAACAAAAA ACACAAGACG GCTCACACCT GTAATCCCAG 960 

CACTTTGGGA RGCCGARGCA GGTGGATCAC GARGTCAGGA GTTCCAAGAC TAGCCTGGCC 1020 

AACCTGGTGA AGCCCCGTCT CTACTAAAAA TACMAATATT AGTCGGGCGT GGTGGTGGGC 1080 

ACGTGTAATC CCAGCTACTC GGGAGGCTGA GGCAGGAGAA TCCCTTGAAG CTAGGAGGCA 1140

GAGGTTGCAG TGAGCCAGGA TCGTGCCATT GCACTCCAGC CTGGACAACA AGAGCAAGAT 1200 

TCCATCTCAA AAAAAAAAAA 1220 

(2) INFORMATION FOR SEQ ID NO: 141: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 721 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

AATTCGGCAC GAGCCAGGTT AGCCGGAAGG GCAGCTCTCC AGGCCCTGCC CACCCCACAG 60

GGGGCTCCTT A-TCCACAGCG GGGCGTCTCC TTGTGGCCAT AGAAACGGAA CTGGCTCTTT 120

TCAACAGTGC TGCAAGAGGA TGGTTATTTA ACGCTGGCCC CCAAGGAGGA AAGGCACAGA 180

CYTTCCTCCC TCCTCGAACA TCCAAGGGCA CTGGATCCTC TGTGTCCCTC TGAGATGGGG 240

TGCCACTCCA GCAAGAGCAC CACGGTGGCA GCTGAGTCCC AGAAGCTTGA AGAAGAGYGC 300

GAGGGAAGAG AGCCAGGTCT GGAGACCGGC ACCCAGGCAG CAGACTGCAA GGATGCCCCG 360

CTGAAGGATG GAACCCCTGA GCCAAAGAGC TGAAATGCCT CTCTCCAGAG TCGGACCCTC 420

ACCTCYTTCC TGGAACTGCC TTTGGCCCCA GAACCATGAG ACAATCCCCA CCCTGAGAAG 480

CTCCGATCAC TOGGAGGAGA GAGAAAGCCT CCAGCTTTGG GATTCAGGCT TCAGAAGTTT 540

TTAGCAGCCT TTGCTCATTG GAGAGGTGGG GAAAGGATAA AGTTCTTATA AGGAAATCCC 600

TAATTTCCCC CAGCTCCTCC CCNCCNGAAG AAGGAACNAA AGAAAGTTCC TTCCACACGT 660

TTTGTTGGAA ACTTTTCCCT TGCCAACTTT CCITGGATTG CCAGAACAAA GCCCTCCAGA 720

721

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1468 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

ATGAATTAAT GTTTATAAAT GACTGTACTG AATTTAAAAC CGTACAGTTT CATTTGCATT 60 
TIGACATTAC TTTATTATAC AITTTGCATT TAAAAGGCTG CACCAGTTGG CTTTTCITCT 120
GTTTTATrCT CAAAATATAG AGATTCTGTG ATTTATTTGC CCTGTTTATG GATTAAAAAG 180 
AAAATTCTAA TATAAAGCAT TTCAATAGGA TGCATAGGTA TATTACGTTT TTTAAATGCT 240 
TTAGATCTGT GATTCTTCAC TTACTATTTA TTTTATCCCC TTTAAGTCAG GGATGCTTTA 300 
TTCrATTTTA AAGCACTTAT GAGTTACATG TTGTAATCAA GTTTGCACAA TATATTTATC 360 
TATATCAGGA ACCCATAAAT GAATAGCTAA TTTTTAAAAT GCCATTAAAA TGCATGAAAT 420
KCTTATTAAA ACCTTACTAT ACTATTTCTT CAAGGCAAGT AAATTGACCA TGRGRAAAGR 480 
ACACAGTTAT TAAACACTGT TGACAGGAAA ATTCTCCTTG ATAACATAGG ACAATTAATG 540 

GAAAAAAAAA TTCTCATTAT TTGCAAAGAA TGAACAAGTT AATGAACAAA CAAACTAGAT 600
TTGGTATGTT TTCAGCTTTT GTATCATGTT TAATTGTTTA ATTTGGTTGA AAAACTGCAG 660
TTGAGAAATC AGATAGCAAT ATAGACATTC ACAGCAGCTC TGTGGATACC ATGTAATTGT 720
CAGGTAATTT CAGAATGriTG AAAATTATTC AGTGCAGCCC TCATAGTATC ATACTTGAAG 780
AAATTGATTA CAGTTCCACT AAATTGTTGA AGATAAATTA TTTTTAAAGG TTATGAAAAC 840
TAAGTTATAT TAATTCATAT GTTTGATTTT TAAATCCCAC CTCCTCAAGC TATCCAATTT 900
NCTGACTTTG AAAATAACCA TGAGAGATGC CACATTTCTC TCTGGGAAAC TACCACTCAA 960
AGAATAATTG TTAAAAATTA AGCTTTTAGG TATTAGAAGC TGTTATAAAG TATAAAATTA 1020
AGATATAAGC AGATCACATG TAAATCATTC CTAAAGCACA AGAAAAGAAT GTGCCTTGAT 1080
GTACATATAT TACTAAGTTG CCTCTCCCAG TTTACTTTAA AAATGGCTTT AAGGATAAAG 1140
AATAAATGTG ATAGCTGTGC ATGCATTATA TATTTGCATT TGCAAATTTC CCATTGTTTT 1200
AACAGCTGTG TGGCTGACTT TCAATTTTAA GACGTGAATT GACATACAGC CCATAACTTT 1260
ATAATGGCTG CTCATTTATC TTATCTTTCA GTTAGTGGAA AAACATTTCA ACCTGACTAA 1320
AATTTGGAAT TGTGTCTTTT ATGTTCCATC CTCTGTTGTT ACTAGATTTA GTTTAAAAAT 1380
TGTGTATGAC CATTAATGTA TGTCATAAAC ATGTAAATAA AAGATGTTGA ATCTTGTTGA 1440
AAAGCAWRAA AAAAAAAAAA AAACTCGA 1468

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 
TGAATTTTTT GCCAAACTTA GTAACTCTGT TAAATATTTG GAGGATTTAA AGAACATCCC 60
AGTTTGAATT CATTTCAAAC TrTTTAAATT TTTTTGTACT ATGTTTGGTT TTATTTTCCT 120
TCTGTTAATC TTTTGTATTC RCTTATGCTC TCGTACATTG AGTACTTTTA TTCCAAAACT 180
AGTGGGTTTT CTCTACTGGA AATTTTCAAT AAACCTGTCA TTATTGCTTA CTTTGATTAA 240
AAAAAAAAAA AAAAAAAAAA AAACCCCNAG QGGGGGGCCG GGTNCCCAAT CCCCCCCAAA 300



(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

CCTGCAGATT GTGGACAGTA GTTCCTCAGC CTGCACCCTG GATTCCTTCT 60 

TCCCCTTCCT AGCTCCATGG GACTCGCCCC AAGACTGTGG CTTCAAGGAC CACCAGCCCC 120 

TTACTCTTCA AGCCCTGACT GTGGAGTTGG TAGATGCCTC TGATCCTCAG TATTCTCTCT 180 

GGCAATGTTC CACGGCTTCT CCTTCCTGGG AGCTGGCTCC ATAACTTGAT TTTCCCCAAA 240 

CGTGTTGCAA TCCCTGCTGC CCCTTAGCCA CCCAGGGTCT TGTGTGGGTA TGAGTGTAGA 300 

GGATGGGGGT ATGCCAGGCC TGGGCCGTCC CAGGCAGGCC CGCTGGACCC TGATGCTACT 360 

CCTATCCACT GCCATGTACG GTGCCCATGC CCCATTGCTG GCACTGTGCC ATGTGGACGG 420 

CCGAGTGCCC TTYCGGCCCT CCTCAGCCGT GCTGCTGACT GAGCTGACCA AGCTACTGTT 480 

ATGCGCCITC TCCCTTCTGG TAGGCTGGCA AGCATGGCCC CAGGGGCCCC CACCCTGGCG 540 

CCAGGCTGCT CCCTTCGCAC TATCAGCCCT GCTCTATGGC GCTAACAACA ACCTGGTGAT 600 

CTATCTTCAG CGTTACATGG ACCCCAGCAC CTACCAGGTG CTGAGTAATC TCAAGATTGG 660 

AAGCACAGCT GTGCTCTACT GCCTCTGCCT CCGGCACCGC CTCTCTGTGC GTCAGGGGTT 720 

AGCGCTGCTG CTGCTGATGG CTGCGGGAGC CTGCTATGCA GCAGGGGGCC TTCAAGTTCC 780 

CGGGAACACC CTTCCCAGTC CCCCTCCAGC AGCTGCTGCC AGCCCCATGC CCCTGCATAT 840 

CACTCCGCTA GGCCTGCTGC TCCTCATTCT GTACTGCCTC ATCTCAGGCT TGTCGTCAGT 900 

GTACACAGAG CTGCTCATGA AGCGACAGNG GCTGCCCCTG GCACTTCAGA ACCTCTTCCT 960 

CTACACTTTT GGTGTGCTTC TGAATCTAiGG TCTGCATGCT GGCGGCGGCT CTGGCCCAGG 1020 

SCTCCTGGAA GGTTTCTCAG GATGGGCAGC ACTCGTGGTG CTGAGCCAGG CACTAAATGG 1080 

ACTGCTCATG TCTGCTGTCA TGAAGCATGG CAGCAGCATC ACACGCCTCT TTGTGGTGTC 1140 

CrcCTCGCTG GTGGTCAACG CCGTGCTCTC AGCAGTCCTG CTACGGCTGC AGCTCACAGC 1200 

CGCCTTCTTC CTGGCCACAT TGCTCATTGG CCTGGCCATG CGCCTGTACT ATGGCAGCCG 1260 

CTAGTCCCTG ACAACTTCCA CCCTGATTCC GGACCCTGTA GATTGGGCGC CACCACCAGA 1320 

TCCCCCTCCC AGGCCTTCCT CCCTCTCCCA TCAGCAGCCC TGTAACAAGT GCCTTGTGAG 1380 

AAAAGCTCGA GAAGTGAGGG CAGCCAGGTT ATTCTCTGGA GGTTGGTGGA TGAAGGGGTA 1440 

CCCCTAGGAG ATCTGAAGTG TGGGTTTGGT TAAGGAAATG CTTACCATCC CCCACCCCCA 1500 

ACCAAGTTCT TCCAGACTAA AGAATTAAGG TAACATCAAT ACCTAGGCCT GAGAAATAAC 1560 

CCCATCCTTG TTGGGCAGCT CCCTGCTTTG TCCTGCATGA ACAGAGTTGA TGAAAGTGGG 1620 

GTGTGGGCAA CAAGTGGCTT TCCTTGCCTA CTTTAGTCAC CCAGCAGAGC CACTGGAGCT 1680 

GGCTAGTCCA GCCCAGCCAT GGTGCATGAC TCTTCCATAA GGGATCCTCA CCCTTCCACT 1740 

TTCATGCAAG AAGGCCCAGT TGCCACAGAT TATACAACCA TTACCCAAAC CACTCTGACA 1800 
GTCTCCTCCA GTTCCAGCAA TGCCTAGAGA CATGCTCCCT GCCCTCTCCA CAGTGCTGCT 1860

CCCCACACCT AGCCTTTGTT CTGGAAACCC CAGAGAGGGC TGGGCTTGAC TCATCTCAGG 1920 

GAATGTAGCC CCTGGGCCCT GGCTTAAGCC GACACTCCTG ACCTCTCTGT TCACCCTGAG 1980 

GGCTGTCTTG AAGCCCGCTA CCCACTCTGA GGCTCCTAGG AGGTACCATG CTTCCCACTC 2040

TGGGGCCTGC CCCTGCCTAG CAGTCTCCCA GCTCCCAACA GCCTGGGGAA GCTCTGCACA 2100 

GAGTGACCTG AGACCAGGTA CAGGAAACCT GTAGCTCAAT CAGTGTCTCT V/TAACTGCAT 2160 

AAGCAATAAG ATCTTAATAA AGTCTTCTAG GCTGTAGGGT GGTTCCTACA ACCACAGCCA 2220 

AAAAAAAAAA AAAAAAACTC GAG 2243 



(2) INFORMATION FOR SEQ ID NO: 145: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1082 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 
GCCAAGCTCT AATACGACTC ACTATAGGGA AAGCTGGTAC GCCTGCAGKT ACCGGTTCCG 60 
GGAATTCCCG GGTCGACCCA CGCGTCCGCT TCCGTGTGTC AAAATCCTCA CCTCCTTCAT 120
AACCATCTCC CACAATTAAT TCTTGACTAT ATAAATTTAT GGTTTGATAA TATTATCAAT 180 
TTGTAA1CAA TTCAGATTTC TTTAGTGCTT GCTTTTCTGT GACTCAACTG CCCA3ACACC 240 

TCATTGTACT TGAAAACTGG AACANCTTGG GAATGCCATG GGGTTTGATA ATCTGCCAGG 300
GACATGAAGA GGCTCAGCTT CCTGGGACCA TGACTTTGGC TCAGCTGATC CTGNACATGG 360 
GAGAACAACC ACATTTTTCT TTGTGTGTGC TTCTAGCAGC TGTTCGGGAG GACCKTGACC 420
CAAYAGTGTT CCCATGCTGT TTCTTGTGAA ATGCTCTCGG CTATGTAGCA GCTTTTGATT 480 
CCCTGCATAC CCTAGGCTGC TGCCCCTATC CTGTCCCTTG TTTATAACAT TGAGAGGTTT 540 

TCTAGGGCAC ATACTGAGTG AGAGCAGTGT TGAGAAGTCG GGGAAAATGG TGACTACTTT 600 
TAGAGCAAGG CTGGGCATCA GCACCTGTCC AGCTCTACTT GTGTGATGTT TCAGGAACTC 660 
AGCCCCTTTT TCTGCCTAGG ATAAGGAGCT GAAAGATTAA CTTGGATCTY CTAATGGTCC 720
AAATCTITTG GTCACAATAA AGAGTCTCCA AATTAGAGAC TGCATGTTAG TTCTGGATGG 780
ATTTGGTGGC CTGACATGAT ACTCTGCCAG CTGTGAQGGG ACCCCGTTTT TAAGATGCAT 840
GGCCAAGCTC TCTGCAAATG GAAATGCTTA CACTGGGTGT TGGGGATGTT TGC7ACCTCC 900
TGCTATTTTT GTGGTTTTGG TTCTCCCACT ATGGTAGGAC CCCTGGCCAG CATTGTGGCT 960
TGTCATGTCA GCCCXTATTGA CTACCTTCTC ATGCTCTGAG GTACTACTGC CTCTGCAGCA 1020
CAAATTTCTA TTTCTGTCAA TAAAAGGAGA TGAAAATAAA AAANAAAAAA AAAAAACTCG 1080
NG 1082





(2) INFORMATION FOR SEQ ID NO: 146: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4313 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

CAAGCTGGTT TGAAACTAGG GGTCGGGCTC GGCCGTCGTC GTTGTTTGTC GCCGCATCCC 60
CGCTTCCGGG TTAGGCCGTT CCTGCCCGCC CCCTCCTCTC CTCCCTTCGG ACCC^TAGAT 120
CTCAGGCTCG GCTCCCCGCC CGCCGCAGCC CACTGTTGAC CCGGCCCGTA CTC-CGGCCCC 180
GTGGCCACCA TGTCCCTGCA CGGCAAACGG AAGGAGATCT ACAAGTATGA AC-CC-CCCTGG 240
ACAGTCTACG CGATGAACTG GAGTGTGCGG CCCGATAAGC GCTTTCGCTT GGCCCTGGGC 300
AGCTTCGTGG AGGAGTACAA CAACAAGGTT CAGCTTGTTG GTTTAGATGA GGAGAGTTCA 360
GAGTITATTT GCAGAAACAC CTTTGACCAC CCATACCCCA CCACAAAGCT CATGTGGATC 420
CCTGACACAA AAGGCGTCTA TCCAGACCTA CTGGCAACAA GCGGTGACTA TCTCCGTGTG 480
TGGAGGGTTG GTOAAACAGA GACCAGGCTG GAGTGTTrGC TAAACAATAA TAAGAACTCT 540
GArrrCTCTG CTCCCCTCAC CTCCTTTGAC TGGAATGAGG TGGATCCTTA TCTTTTAGGT 600
ACCTCAAGCA TTGATACGAC ATGCACCATC TGGGGGCTGG AGACAGGGCA GGTGTTAGGG 660
CGAGTCAATC TCGIGTCTGG CCACGTGAAG ACCCAGCTGA TCGCCCATGA CAAAGAGGTC 720
TATGATATTC CATTTAGCCG GGCCGGGGGT GGCAGGGACA TGTTTGCCTC TGTGGGTGCT 780
GATGGCTCGG TCCGGATOTT TGACCTCCGC CATCTAGAAC ACAGCACCAT CATITACGAA 840
GACCCACAGC ATCACCCACT GCTTCGCCTC TGCTGGAACA AGCAGGACCC TAACTACCTG 900
GCCACCATGG CCATCGATGG AATGGAGGTG GTGATTCTAG ATGTCCGGGT TCCTGCACAC 960
CTGTSGCCAG GTTAAACAAC CATCGAGCAT GTGTCAATGG CATTGCTTGG GCCCCACATT 1020
CATCCTGCCA CATCTGCACT GCAGCGGATG ACCACCAGGC TCTCATCTGG GACATCCAGC 1080
AAATGCCCCG AGCCATTGAG GACCCTATCC TGGCCTACAC AGCTGNAAGG WGAGATCAAC 1140
AATGTGCAGT GGGCATCAAC TCAGCCCGAA YTGTCGCCAT CTGCTACAAC AACTGCCTGG 1200
AGATACTCAG AGTGTAGTGT TGGTGGCGCT GTGCCCACGA GGCAGGGGCT TTrGTATTTC 1260
CTGCCTCTGC CCCACCCCCA AAGTAAGAAG AAACATGTTT CCAGTGGCCA GTATGTCTTT 1320
CATTGCTTTG CACCCACTGT TACCAGAAGC TGCTCTAGGA GTTCCTGGCC AGTCACCCCA 1380
TCGCCCTCTG TGGCAGACTC AGTGCTGTGT GGCGCCTCCT CAGCCCAGGG CTGAGTTTTA 1440
AGATTTTCTC TCCTTTCCTC TTCTCCTTTG GTTCCTCAAT TAAAAAATGT GTGTATATTT 1500
GTTTGTCAGG CGTTGTGTTG AGGAGCAGTT CACGCACTGG CTGTGTCTAT TCCTCTGCCC 1560
AGGTGTCTCT GTTTGCTGCC CAAKGYWKKT TTTCATGTCT CGTCCATGTC CATGTTCGTG 1620
TTAGCACTWA CGTGGGAACA AATACCAATT TGTCTTTTCT CCTAGTATCA GTGTGTTTAA 1680
CAAAITTTAA CTTTGTATAT TTGTTATCTA TCAGGCTAAT TTTTTTATGA AAAGAATTTT 1740
ACTCTCCTGC TTCATTTCTT TGTCTTATAG TCCTCCCTCT TTGCACCTTC TTCTCTTCCC 1800
TCAGTGCCTG GAGCTGGTAC TGGGCCCCTG GCCCCATGAG CAGTTTGCCT TCTTGAGTCA 1860
CTGCXrrGTGT AGTACATACC TGACCGGGAG TCCAAACCAC CTTGGTGCTC TGAAGTCCAC 1920
TGACTCATCA CACCTTTCTT AGCCTGGCTC CTCTCAAGGG CATTCTGGGC TTGTAAACAG 1980
ACATAGGAAG CCTCTGTTTA CCCTGAAGCA CCACTGTCCA GCCCATTGGT TCCCACTGGC 2040
AGCATGGTAG AGCTGAGAGA AACAGGCTCT CAGGGTACCT GACTTGAGGG GAATCGTTTC 2100
ATGAAGCTGA ACTTCAAGCA TATTTCCAGT ACATTCTTTC AGAGTCTGTT TTTCCATCCA 2160
AATATAAGCC CCAGGCCATT CCACTTAGTG TCTTTTCAAT GATAGGCAAG AATGATATCT 2220
GAGTTGAACT TCGGTGCTTC TGTTGTTTGA GTTTACTGTG CCTGGTGGTA TATTGGGCAT 2280
TCTTTGGATT GAGTGTTCTG AGGTGAGAGA GTCTTCCCGA GGCATCCTGT CTGTGCTTCC 2340
AACCCTGAAC AAGACCTTAC ATGAGAGATG GACTGATGGA CTGCGGCAAT CCTGGGCTGT 2400
CAAGTGGATA GATAGTTAAA AAGCATTATA CTGTGGGTAA TGAAAAGGGA GGAAAAAAAA 2460
AGAAGGAAAA GGAATTATAG ACCCCCAGGG TCAGCCAGTT AAGAGCTCTA CCCAGACCTG 2520
TCAACCCCTC TCTCCCCCAG TTTAGGTTCT GAGCAGTATT GGACTTGTAG CCTGCAGTTG 2580
TCTTTTGACT TGCAGGCCGC AGTGTCTTTC TGTTATGTGA ATGAGTTCCA TGGAGGGGCA 2640
TATGTGTGAT TCCACCGTTA GATGAGCCCT TGGGGCAGGC AGTTTGGGAT GTGCTCTTGG 2700
GGGAAAGTTG GCTGTTTCCT TGCGCTCTGC TCCTACCCGA AGTITrTAAG TCCCTCTGAA 2760
TTCCTCATCT GAGATTAGTA GAGTAGCAGG CCTGAAGGAT GATGGTTTTG TCCTCTTTGG 2820
TTCTCACCTG CTTGAGAAGT AAAACAGTAA CTTTGTTCTT CTGGGCCCTT AAGCTTTTTT 2880
GGTTAAGTCT TCCTTTTCAG AAGTAGATGT CATTATATGC CAAAAGTCTA GCTCTTTGCT 2940
TTACCATACA GGGACCTGTC CCAAAGAAAA AGGCTCTTTT TTTAGCCAGC ATATTTCCCC 3000
TTCTACCCTT TTACrTTGTT GTTCTGATTT TAGGACTCTG GCTGGCCATG TGCTTGTGGT 3060
TGCCTCTCCT GCATTTGCCA CTGGATTTGC ACTGCATCGT TTGGAGATAC AAAGCGAGCA 3120
GTTCTTGGTC AGAACCCTCC TCTGCTTTTC ATTGTGTTTG ATAATGGTTA CTGGGTCCTT 3180
CTCTCAAGGG TAGCAAGGCC AAGCTGATGG CTGCTTGTTT AGGAGGCCAT CAGTTCCTTC 3240
CTGTGGAGAA GGGTCTGAAA TGGAAGTCAG 1GGTAGAAGG GGCTGGTCTG CTGGGCAGGG 3300
CTTACATCCA CTGAGTTCTA AGATTCCTTT CCTGATCTGC ACCTACGCCT GGTCTGTATG 3360
GTGGAATTTG TCAGCTQGAA CTCAGAAACA ACAACTTGAA AAAAAAATAA TAATTAGAAC 3420
ATATTTGCAT AAGATAGCTA TTTACTCTGG AAACCAACAA CTTTTGAGAT 1TCCCTTGCC 3480
CTGTGGACGC CCAGCTCCTG TCATCCTTCC TTAGGTCCTG CAGTACAGTC TTCCCCTGAA 3540
TGCCACCGGG GACCCAGGGG GACTCCACCC CCCTAAGCAA GCACACACAT ACTCACAGTT 3600
GATGAGTTGC TGGTCTTTGA GTCCCAGCTC TCTTACCCTC CCTTTACTCC ACCAGCCCGA 3660
CGACCCATGA CTGAGGAGGG GATTTCTACA GTCTCAGGAT TTAGAAAGTC TGTAAGCCAT 3720
CCATGCTCCA GAAAGCACCG ATCTGTTGTA GTTGCAAAAA CAACTCTGTA ATTTGTTGAG 3780
GTTCTCAAAC TGACAGCCAG CGAGACTGGG TGGGAGGCCC TGGATCTGTT CTCCCTGACT 3840
GCGGGAGGAG CAGCCACTAG GACTTTAGCA GGAAGCCCAC ATGGAGGCTC CGCCAGGCTG 3900
TGGCCCAGCT GGTGATGGCC dTITGCTCC TCGCAGCCTG AGGCACAGCT GCCTGTATTG 3960
TCCTCATCTG TTCTGACTGA AGGATGGAGG TGCTGAATAA ATTAGGCCTC AGGCNTCTAC 4020
CACCAGAGAG CTGGAGAATG GGTCCACGTC ATTCAAGGAC CTGAATTTTT TATGCTCAGG 4080
AGCATTGGAA TCCTCTTCTT CCAGGGAGGA ATTAGCCTGC AAGGTTAGGA CTTGAAGAGG 4140
GAAGGTATTT AATAACTGGG CGAGGATGGG TCTGGTGGCT CACACCTGTA ATCCCAGCAT 4200
TTTGQGAGGC TGAGGTGGCC AGATCCCAAG GTCAGAAGAT CGAGACCATC CTGGCTAACA 4260
TGGTGAAACC CCATCTCTAC TAAAAATACA AAATTAAATT GGCCGGGCGT GAA 4313



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1183 base pairs

(B) TYPE: nucleic acid



